
Implementation Manual Sir Cache


Search engines can only rank pages they can index efficiently. However, sites that rely heavily on JavaScript rendering are often crawled slowly or incompletely.

Sir Cache solves this problem by generating static versions of pages. Serving these pre-rendered versions to the crawler keeps responses fast and improves indexing through Dynamic Rendering (a technique Google documents for JavaScript-heavy sites).

You can implement Sir Cache either by integrating it with a CDN such as Cloudflare or CloudFront, or by configuring it directly on a web server such as Apache or Nginx. CDN-level integration gives the best and most efficient results.

This guide explains the requirements, configuration steps, and practical examples for installing Sir Cache in your infrastructure. It also provides installation models for different CDNs and web servers, whose logic you can adapt to other platforms as needed.


Installation

Prerequisites

To install Sir Cache on your website, you need to complete these tasks:

1. Detect when Googlebot-Mobile accesses your site.

2. Transparently change the HOST when a request comes from Googlebot-Mobile, so that the request is sent to a different domain without the client noticing.

3. Create exceptions for resources you don't want to serve statically (for example, to keep product prices up to date for Google Shopping) and for requests whose User-Agent contains a special indicator (to avoid looping).

4. If your infrastructure uses a WAF, add a user-agent string or an IP range to its safelist so that requests from Sir Cache are not blocked.
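As a rough illustration of tasks 1 and 3, the routing decision can be written as a single predicate. This is a minimal sketch, not Simplex's own code: the function names, the excluded path prefixes, and the way Googlebot-Mobile is recognized (both "Googlebot" and "Mobile" present in the User-Agent) are assumptions to adapt to your site.

```javascript
// Paths that should always be served by your own origin (hypothetical list).
const EXCLUDED_PATH_PREFIXES = ["/checkout/", "/api/prices/"];

// Assumption: a request counts as Googlebot-Mobile when its User-Agent
// contains both "Googlebot" and "Mobile", in any order and any letter case.
function isGooglebotMobile(userAgent) {
  return /googlebot/i.test(userAgent) && /mobile/i.test(userAgent);
}

// Returns true when the request should be sent to Sir Cache instead of the origin.
function shouldRouteToSirCache(userAgent, pathname, bypassToken = "SimplexBypass") {
  const ua = userAgent || "";
  if (!isGooglebotMobile(ua)) return false;   // task 1: only the crawler is rerouted
  if (ua.includes(bypassToken)) return false; // task 3: special indicator, avoid looping
  // task 3: resources that must never be served statically
  return !EXCLUDED_PATH_PREFIXES.some((prefix) => pathname.startsWith(prefix));
}
```

Keeping the decision in one pure function makes it easy to reuse the same rule in a CDN worker, an edge function, or a test suite.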


Here's how the basic operation works if your site is https://www.domain.com.br/ and your Sir Cache account returns data from https://i4c-domain.simplex.live/:

1. Googlebot-Mobile requests the URL: https://www.domain.com.br/infos/about-us.html

2. Your infrastructure detects the Googlebot request and routes it like a transparent proxy, replacing the HOST www.domain.com.br with i4c-domain.simplex.live.

3. Your infrastructure then requests https://i4c-domain.simplex.live/infos/about-us.html, forwarding all client headers (User-Agent, cookies, and so on).

4. Simplex's infrastructure receives this request and sends back the optimized static version of the page to your infrastructure.

5. Finally, your infrastructure delivers the received content to Googlebot-Mobile at the originally requested URL: https://www.domain.com.br/infos/about-us.html
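The host replacement in steps 2 and 3 can be sketched as follows. The helper name toSirCacheUrl is hypothetical; the host names are the ones used in the example above. Only the host changes, while the path and query string are kept as they are.

```javascript
// Example Sir Cache host from this guide; replace it with your account's host.
const SIR_CACHE_HOST = "i4c-domain.simplex.live";

// Builds the URL to request from Sir Cache for a given incoming URL.
function toSirCacheUrl(originalUrl, sirCacheHost = SIR_CACHE_HOST) {
  const url = new URL(originalUrl);
  url.protocol = "https:";     // Sir Cache is addressed over HTTPS in this guide
  url.hostname = sirCacheHost; // swap only the host
  url.port = "";               // drop any non-default port of the original site
  return url.toString();
}

console.log(toSirCacheUrl("https://www.domain.com.br/infos/about-us.html"));
// prints "https://i4c-domain.simplex.live/infos/about-us.html"
```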


Important! This process does not perform a 301 or 302 redirect. Throughout the process, the URL seen by the client stays as https://www.domain.com.br/infos/about-us.html while the content comes from https://i4c-domain.simplex.live/infos/about-us.html.


Preventing Looping Risk

When Sir Cache receives a request for https://www.domain.com.br/infos/about-us.html but hasn't generated a static version yet, it must fetch the page content from your origin and create a static version.

Sir Cache sends this request to your origin using the original headers (User-Agent, cookies, etc.), which might include the Googlebot-Mobile user-agent. Your infrastructure would then detect Googlebot-Mobile again and route the request back to Sir Cache, causing an infinite loop of requests.

To prevent this risk, Sir Cache adds "SimplexBypass" to the user agent when requesting your origin. As long as your routing rule skips requests whose User-Agent contains "SimplexBypass", the loop cannot occur.
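The effect of the marker can be seen in a small simulation of the request path. This is only a sketch under assumptions: the function name is hypothetical, the crawler is recognized by "Googlebot" plus "Mobile" in the User-Agent, and the marker is appended to the end of the User-Agent (the source does not say where Sir Cache places it).

```javascript
// Simulates a page that Sir Cache has not rendered yet.
// Each "hop" is one trip from your infrastructure to Sir Cache.
function simulateRouting(userAgent, guardEnabled, maxHops = 5) {
  let hops = 0;
  let ua = userAgent;
  while (hops < maxHops) {
    const isBot = /googlebot/i.test(ua) && /mobile/i.test(ua);
    const hasMarker = guardEnabled && ua.includes("SimplexBypass");
    if (!isBot || hasMarker) {
      return { hops, reachedOrigin: true }; // served by your origin
    }
    hops += 1;                  // your infrastructure forwards to Sir Cache
    ua = `${ua} SimplexBypass`; // Sir Cache calls your origin with the marker added
  }
  return { hops, reachedOrigin: false }; // still bouncing when the simulation stops
}

const crawler = "Mozilla/5.0 (compatible; Googlebot/2.1) Mobile";
console.log(simulateRouting(crawler, true));  // { hops: 1, reachedOrigin: true }
console.log(simulateRouting(crawler, false)); // { hops: 5, reachedOrigin: false }
```

With the guard in place, the request reaches the origin after a single trip through Sir Cache; without it, the request keeps bouncing until the simulation gives up.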




Practical Examples

Example 1 – Integration with Cloudflare CDN

Here is the typical code for a Cloudflare Worker
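A minimal sketch of such a Worker, written under assumptions rather than taken from Simplex: the helper names are hypothetical, the Sir Cache host is the example host from this guide, and Googlebot-Mobile is recognized by "Googlebot" plus "Mobile" in the User-Agent. It relies only on the standard Request, Headers, and fetch APIs that Workers provide.

```javascript
const SIR_CACHE_HOST = "i4c-domain.simplex.live"; // example host; use your account's host
const BYPASS_TOKEN = "SimplexBypass";

// True for Googlebot-Mobile requests that do not carry the bypass marker.
function shouldRoute(userAgent) {
  const ua = userAgent || "";
  const isBot = /googlebot/i.test(ua) && /mobile/i.test(ua); // assumed pattern
  return isBot && !ua.includes(BYPASS_TOKEN);
}

// Same path and query string, different host.
function buildUpstreamUrl(requestUrl) {
  const url = new URL(requestUrl);
  url.protocol = "https:";
  url.hostname = SIR_CACHE_HOST;
  url.port = "";
  return url.toString();
}

const worker = {
  async fetch(request) {
    if (!shouldRoute(request.headers.get("User-Agent"))) {
      return fetch(request); // regular traffic continues to your origin
    }
    // new Request(url, request) copies the method, headers, and body,
    // so the client headers are forwarded to Sir Cache.
    const upstream = new Request(buildUpstreamUrl(request.url), request);
    return fetch(upstream); // the response is returned as-is, with no redirect
  },
};

// In a Workers module file, finish with: export default worker;
```

The Worker would then be attached to a route covering the pages you want Sir Cache to serve. Paths that must stay dynamic can be excluded either in the route pattern or with an extra check in shouldRoute.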


Example 2 – Integration with CloudFront CDN

Step-by-Step Configuration Example

Step 1: Create a Lambda Function

1. Open the AWS Lambda Console in the US East (N. Virginia) region (us-east-1), where Lambda@Edge functions must be created.

2. Create a new Lambda function and choose "Author from scratch."

3. Name your function and select a runtime (Node.js or Python are common choices).

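4. Under "Permissions," select or create an execution role with basic Lambda@Edge permissions: the role must be assumable by both lambda.amazonaws.com and edgelambda.amazonaws.com and must be allowed to write to CloudWatch Logs.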

5. Click "Create function."


Step 2: Write the Function Logic

Below is a sample Node.js function that checks the User-Agent and modifies requests as needed. This example assumes you want to change the request URI or the host header based on detecting Googlebot-Mobile access:
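A minimal sketch of such a function for an Origin Request trigger, not Simplex's official code. The helper name routeRequest, the timeout values, and the Googlebot-Mobile pattern ("Googlebot" plus "Mobile") are assumptions; the host is the example host from this guide. Note that CloudFront only passes the viewer's User-Agent to an origin-request function if the distribution is configured to forward that header.

```javascript
const SIR_CACHE_HOST = "i4c-domain.simplex.live"; // example host; use your account's host
const BYPASS_TOKEN = "SimplexBypass";

// Mutates and returns the CloudFront request object.
function routeRequest(request) {
  const uaHeader = request.headers["user-agent"];
  const userAgent = uaHeader && uaHeader.length > 0 ? uaHeader[0].value : "";
  const isBot = /googlebot/i.test(userAgent) && /mobile/i.test(userAgent); // assumed pattern

  // Regular visitors and Sir Cache's own origin fetches are left untouched.
  if (!isBot || userAgent.includes(BYPASS_TOKEN)) return request;

  // Point this request at Sir Cache as a custom origin (values are illustrative).
  request.origin = {
    custom: {
      domainName: SIR_CACHE_HOST,
      port: 443,
      protocol: "https",
      path: "",
      sslProtocols: ["TLSv1.2"],
      readTimeout: 30,
      keepaliveTimeout: 5,
      customHeaders: {},
    },
  };
  // The Host header has to match the new origin.
  request.headers["host"] = [{ key: "Host", value: SIR_CACHE_HOST }];
  return request;
}

// Lambda@Edge entry point: returning the request lets CloudFront continue with it.
const handler = async (event) => routeRequest(event.Records[0].cf.request);

if (typeof exports !== "undefined") exports.handler = handler;
```

Because the URI is left unchanged, Sir Cache receives the same path and query string that Googlebot-Mobile requested.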


Step 3: Deploy the Lambda Function to the Edge

1. In the Lambda console, select Actions > Deploy to Lambda@Edge.

2. Configure the deployment settings:

   a. CloudFront Distribution: Select your distribution.

   b. CloudFront Event: Choose Origin Request to modify requests before they reach the origin. This lets you conditionally route requests based on the Googlebot-Mobile User Agent.

   c. Cache Behavior: Specify which cache behavior this applies to, or select Default (*) to use it for all requests.

3. Confirm the deployment.


Step 4: Test the Configuration

After deploying your function to Lambda@Edge, allow a few minutes for the changes to propagate globally. Test your website to ensure requests are modified or routed as expected based on the Googlebot-Mobile User Agent.
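One way to run this check is to request the same page with different User-Agent values and compare the responses. The script below is a sketch for Node.js 18 or later; the probe list, the function name, and the user-agent strings are illustrative assumptions, and a WAF may block requests that merely imitate Googlebot.

```javascript
// User-Agent values to try (illustrative strings, not an official list).
const PROBES = [
  { name: "regular browser", userAgent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" },
  { name: "Googlebot smartphone", userAgent: "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" },
  { name: "crawler with bypass marker", userAgent: "Mozilla/5.0 (compatible; Googlebot/2.1) Mobile SimplexBypass" },
];

// Fetches the URL once per probe and records the status and body size.
async function runProbes(url, fetchImpl = fetch) {
  const results = [];
  for (const probe of PROBES) {
    const response = await fetchImpl(url, {
      headers: { "User-Agent": probe.userAgent },
      redirect: "manual", // a 301 or 302 here would indicate a misconfiguration
    });
    const body = await response.text();
    results.push({ name: probe.name, status: response.status, bytes: body.length });
  }
  return results;
}

// Usage: runProbes("https://www.domain.com.br/infos/about-us.html").then(console.table);
```

If routing works, the first and third rows should look like your normal origin response, while the second row should reflect the static version, which typically differs in size.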


Example 3 – Integration with Apache2

To implement logic similar to the Cloudflare Worker on an Apache 2 server, combine URL rewriting (mod_rewrite) with reverse proxying (mod_proxy). This approach lets you inspect the User-Agent of incoming requests and, if it matches Googlebot-Mobile and does not contain "SimplexBypass," proxy them to another domain or subdomain.


Prerequisites

Apache 2 Installed: Ensure your system has the Apache server installed.

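Enable Modules: Activate the mod_rewrite and mod_proxy modules, plus mod_proxy_http, which is needed to proxy HTTP and HTTPS content (proxying to an https:// backend also requires mod_ssl). On Debian/Ubuntu systems you can enable them with the a2enmod command, for example: sudo a2enmod rewrite proxy proxy_http ssl, followed by sudo systemctl restart apache2.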

Access to the Configuration File: You need access to your site's configuration file, usually located in /etc/apache2/sites-available/, or to a .htaccess file in the site's root directory.


Configuration Example:

Add the following lines to your Apache configuration file or .htaccess:
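A minimal sketch of such a configuration, given as an assumption-laden starting point rather than Simplex's official rules. It uses the example host from this guide and recognizes Googlebot-Mobile by "Googlebot" plus "Mobile" in the User-Agent. To my knowledge SSLProxyEngine is not permitted in .htaccess, so this version is meant for the site's virtual host block.

```apache
# Allow mod_proxy to connect to an https:// backend (requires mod_ssl).
SSLProxyEngine On

RewriteEngine On

# Match Googlebot's mobile crawler (assumed pattern: both tokens present)...
RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
RewriteCond %{HTTP_USER_AGENT} Mobile [NC]
# ...but skip the requests Sir Cache itself sends to your origin.
RewriteCond %{HTTP_USER_AGENT} !SimplexBypass

# Proxy (not redirect) the same path to Sir Cache; the query string is passed along.
RewriteRule ^/?(.*)$ https://i4c-domain.simplex.live/$1 [P,L]
```

Resources that must stay dynamic can be excluded with an additional condition such as RewriteCond %{REQUEST_URI} !^/checkout/ (a hypothetical path) placed before the RewriteRule.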


Example 4 – Integration with Nginx

Prerequisites

Nginx Installed: Ensure you have Nginx installed on your server.

SSL Configuration: If you handle https:// requests, ensure SSL is correctly configured for your domain, including obtaining SSL certificates (for example, via Let's Encrypt).

Access to Configuration Files: You need access to your Nginx configuration files, usually located in /etc/nginx/sites-available/ or /etc/nginx/nginx.conf for global settings.


Configuration Example
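A minimal sketch of an Nginx configuration, offered as a starting point under assumptions rather than as Simplex's official setup. It uses the example host from this guide, recognizes Googlebot-Mobile by "Googlebot" plus "Mobile" in the User-Agent, and borrows status code 418 purely as an internal signal. The variable name $sircache_route and the location name @sircache are hypothetical.

```nginx
# In the http {} context: decide once per request whether to use Sir Cache.
# Regular expressions are checked in order, so the bypass marker wins.
map $http_user_agent $sircache_route {
    default                                    0;
    "~SimplexBypass"                           0;
    "~*(googlebot.*mobile|mobile.*googlebot)"  1;
}

server {
    # ... your existing listen, server_name, and ssl_* directives ...

    # 418 is never sent to the client; it only jumps to the named location.
    error_page 418 = @sircache;

    location / {
        if ($sircache_route) {
            return 418;
        }
        # ... your existing handling for regular visitors ...
    }

    location @sircache {
        # No URI part after the host, so the original path and query are kept.
        proxy_pass https://i4c-domain.simplex.live;
        proxy_set_header Host i4c-domain.simplex.live;
        proxy_ssl_server_name on;  # send SNI to the HTTPS backend
    }
}
```

After editing, nginx -t checks the syntax and a reload applies the change. Paths that must stay dynamic can be given their own location block without the if check.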