22.10.2023

How to Block Search Engine Indexing of a Site on Kubernetes Ingress

In Kubernetes, an Ingress resource defines how external traffic is routed to services inside a cluster, and an Ingress controller enforces those rules. While Ingress is essential for routing incoming traffic to your site, there are cases where you want to keep search engines away from its content, such as the pre-production host used in the examples below. This blog post walks you through serving a restrictive robots.txt directly from Kubernetes Ingress, which tells search engine bots not to crawl your pages and so keeps them out of search results.

Prerequisites

To follow this tutorial, you should have a basic understanding of Kubernetes and be familiar with managing Ingress resources. Additionally, you'll need access to the Kubernetes cluster and the necessary permissions to make configuration changes.

Step 1: Create an Ingress Resource

Before you can proceed with preventing indexing, you must have an Ingress resource defined for your site. If you haven't set it up yet, create an Ingress resource by specifying the necessary routing rules and associating it with your service. Find an example below:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    kubernetes.io/ingress.class: nginx
  name: akmatori-front-ingress
  namespace: akmatori-preprod
spec:
  rules:
  - host: preprod.akmatori.com
    http:
      paths:
      - backend:
          service:
            name: akmatori-front
            port:
              number: 80
        path: /
        pathType: Prefix
  tls:
  - hosts:
    - preprod.akmatori.com
    secretName: akmatori-front-ssl

Step 2: Modify the Ingress Configuration

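To keep search engines away from your site, you need to add an annotation to your Ingress resource. Annotations are key-value pairs that pass additional instructions to the Ingress controller. In this case, we'll use the nginx.ingress.kubernetes.io/configuration-snippet annotation of the NGINX Ingress controller, which injects custom NGINX directives into the generated configuration. Note that the annotation takes effect only if the controller is configured to allow snippet annotations (the allow-snippet-annotations option in its ConfigMap).

Open your Ingress resource YAML file and add the following annotation under metadata.annotations, next to the annotations that are already there (the rest of the manifest is omitted here):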

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/configuration-snippet: |
      location /robots.txt {
        return 200 "User-agent: *\nDisallow: /\n";
      }
spec:
  ...

The value of the nginx.ingress.kubernetes.io/configuration-snippet annotation is an NGINX location block that answers requests for /robots.txt with a 200 response whose body is the two-line robots file. The User-agent: * line specifies that the following rules apply to all bots, and the Disallow: / line instructs them not to crawl any page on your site.
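To see how a crawler that honors robots.txt would read these two lines, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_allowed, the bot name ExampleBot, and the /pricing path are illustrative assumptions, not part of the original setup.

```python
from urllib.robotparser import RobotFileParser

# The body returned by the location block in the annotation.
ROBOTS_BODY = "User-agent: *\nDisallow: /\n"


def is_allowed(robots_body: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots_body may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # ExampleBot and /pricing are hypothetical; any agent and path behave the same.
    for agent in ("Googlebot", "Bingbot", "ExampleBot"):
        allowed = is_allowed(ROBOTS_BODY, agent, "https://preprod.akmatori.com/pricing")
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Because the rule set uses the wildcard agent and disallows the root path, every agent is reported as blocked for every URL on the host. This only models compliant crawlers; robots.txt does not stop a client that ignores it.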

Step 3: Apply the Configuration Changes

Save the modified Ingress resource file and apply the changes to your Kubernetes cluster using the kubectl apply command:

kubectl apply -f your-ingress-file.yaml

The Ingress controller will detect the changes and update the configuration accordingly.
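If you want to confirm that the annotation actually landed on the live object, one option is to read the Ingress back as JSON and inspect it. The sketch below assumes kubectl is on your PATH and pointed at the right cluster; the function names has_robots_snippet and fetch_ingress are hypothetical helpers written for this post.

```python
import json
import subprocess

SNIPPET_ANNOTATION = "nginx.ingress.kubernetes.io/configuration-snippet"


def has_robots_snippet(ingress: dict) -> bool:
    """Return True if the Ingress object carries a snippet that mentions /robots.txt."""
    annotations = ingress.get("metadata", {}).get("annotations") or {}
    return "/robots.txt" in annotations.get(SNIPPET_ANNOTATION, "")


def fetch_ingress(name: str, namespace: str) -> dict:
    """Read the live Ingress object as JSON via kubectl (requires cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "ingress", name, "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

With the names from Step 1, has_robots_snippet(fetch_ingress("akmatori-front-ingress", "akmatori-preprod")) should return True once the change is applied. This only shows that the annotation is stored on the resource, not that the controller accepted it, so the HTTP check in the next step is still needed.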

Step 4: Verify the Configuration

To check that the configuration works, request robots.txt from your site. No file is created on disk; NGINX builds the response directly from the return directive in the annotation.

Retrieve the external IP or domain associated with your Ingress resource and append /robots.txt to the URL. For example:

curl https://preprod.akmatori.com/robots.txt

The response should display the contents specified in the annotation, similar to the following:

User-agent: *
Disallow: /
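The same check can be scripted, for instance as part of a deployment pipeline. The following is a rough sketch using only the Python standard library; the function names and the 10-second timeout are assumptions chosen for illustration.

```python
from urllib.request import urlopen

# The two rules the annotation is expected to serve.
EXPECTED_RULES = ["User-agent: *", "Disallow: /"]


def normalize(body: str) -> list[str]:
    """Strip whitespace and drop blank lines so trailing newlines do not matter."""
    return [line.strip() for line in body.splitlines() if line.strip()]


def blocks_everything(body: str) -> bool:
    """Return True if body is exactly the blanket-disallow rule set."""
    return normalize(body) == EXPECTED_RULES


def fetch_robots(base_url: str, timeout: float = 10.0) -> str:
    """Download <base_url>/robots.txt; raises on HTTP errors or network failure."""
    with urlopen(base_url.rstrip("/") + "/robots.txt", timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For the host used in this post, blocks_everything(fetch_robots("https://preprod.akmatori.com")) would be expected to return True when the annotation is active. The comparison is deliberately strict, so any extra rule in the response makes the check fail.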

Step 5: Test Indexing Prevention

To confirm that search engines are not indexing your site, search for it on popular search engines, for example with a site:preprod.akmatori.com query. Keep in mind that search results are not updated instantly, so pages that were already indexed may take some time to disappear.

Conclusion

Preventing search engine indexing on Kubernetes Ingress is a straightforward process with the help of annotations. By adding the appropriate annotation to your Ingress resource, you serve a robots.txt that tells search engine bots not to crawl your site's content. Keep in mind that robots.txt is advisory and only well-behaved crawlers follow it. Remember to verify the robots.txt response and perform tests to confirm that the indexing prevention is working as intended.

While controlling search engine access is a crucial aspect of managing your online presence, ensuring your site's performance and security across global networks is equally important. That's where Akmatori - A Globally Distributed TCP/UDP Balancer comes in.

Akmatori enhances your Kubernetes infrastructure by optimizing traffic management, reducing latency, and improving security across your deployments. By integrating Akmatori into your stack, you not only safeguard your content from unwanted indexing but also ensure that your site remains fast, reliable, and secure for users worldwide.

Elevate your Kubernetes strategy by incorporating Akmatori into your architecture. Discover how easy it is to manage traffic, enhance performance, and secure your applications at a global scale. Start optimizing your network with Akmatori today and experience the full potential of your Kubernetes deployments.
