Free automation template

Generate AI-Ready llms.txt Files from Screaming Frog Website Crawls


Description of the llms.txt generation workflow

This automation template helps you generate an llms.txt file from a Screaming Frog export. Screaming Frog is a popular website crawler.

How it works

A form lets you enter the following data:

  • The name of the website
  • A short description
  • The internal_html.csv file from your Screaming Frog export

Once the form is submitted, the workflow runs automatically and the finished llms.txt file can be downloaded directly from the n8n UI.
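For reference, the generated file uses the simple layout assembled inside the workflow: a heading with the website name, a quoted short description, then one line per URL in the form "- [title](url): description" (the description part is omitted when the page has no meta description). The site and pages below are made up for illustration:

```
# The best website ever
> A short description of the website, in the website's language.

- [Home](https://example.com/): Welcome page with an overview of the site.
- [Blog](https://example.com/blog/): Articles and guides about the topic.
- [Contact](https://example.com/contact/)
```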

Downloading the file

The last node in this workflow is "Convert to File", so the file is downloaded directly from the n8n UI. However, you can easily add a node (e.g., Google Drive or OneDrive) to automatically upload the file to a location of your choice.
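Inside n8n you would simply swap the final placeholder node for the built-in Google Drive or OneDrive node. Purely as an illustration of the same idea outside n8n, here is a minimal Python sketch that uploads an already downloaded llms.txt to Google Drive; the folder ID, service-account key path, and file name are placeholders, not part of the template:

```python
# Sketch: upload a locally downloaded llms.txt to Google Drive.
# Assumes google-api-python-client and a service-account JSON key;
# FOLDER_ID and the key path are placeholders, not part of the template.
from google.oauth2 import service_account
from googleapiclient.discovery import build
from googleapiclient.http import MediaFileUpload

FOLDER_ID = "your-drive-folder-id"  # hypothetical target folder

creds = service_account.Credentials.from_service_account_file(
    "service-account.json",
    scopes=["https://www.googleapis.com/auth/drive.file"],
)
drive = build("drive", "v3", credentials=creds)

# Name the file after the website so files stay distinguishable
# when you work on multiple sites, as the template notes recommend.
metadata = {"name": "example-site-llms.txt", "parents": [FOLDER_ID]}
media = MediaFileUpload("llms.txt", mimetype="text/plain")
uploaded = drive.files().create(body=metadata, media_body=media, fields="id").execute()
print("Uploaded file id:", uploaded["id"])
```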

AI-powered filtering (optional)

The workflow includes a text classifier node, which is disabled by default. You can enable it to apply smarter filtering when selecting URLs for the llms.txt file. It is worth adjusting the description in the classifier node to specify more precisely which types of URLs should be included.
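As a rough illustration of what that classifier does (not the template's actual node, which uses n8n's LangChain Text Classifier with gpt-4o-mini), here is a minimal Python sketch that asks an OpenAI model to sort each page into the same two categories, useful_content and other_content; the prompt wording and helper function are assumptions for illustration:

```python
# Sketch of the optional AI filter: classify crawled pages into the two
# categories used by the template's Text Classifier node.
# Assumes the openai Python package and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

CATEGORIES = {
    "useful_content": "Pages likely to contain high-quality content worth listing in llms.txt.",
    "other_content": "Pages that should not be included (e.g., pagination or low-value content).",
}

def classify_page(url: str, title: str, description: str, word_count: str) -> str:
    """Return 'useful_content' or 'other_content' for one page (illustrative helper)."""
    prompt = (
        "Classify the page below into exactly one category and reply with the "
        "category name only.\n"
        f"Categories: {CATEGORIES}\n\n"
        f"url: {url}\ntitle: {title}\ndescription: {description}\n"
        f"word count: {word_count}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # same model as the template's OpenAI Chat Model node
        messages=[{"role": "user", "content": prompt}],
    )
    answer = response.choices[0].message.content.strip()
    return answer if answer in CATEGORIES else "other_content"
```

Keep in mind, as the template's notes point out, that classifying every URL means one model call per page, so token usage and API cost grow with the size of the crawl.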

How to use

  1. Crawl the website in Screaming Frog
  2. Export the "internal_html" section in CSV format
  3. In n8n, click "Test Workflow", fill in the form, and upload the internal_html.csv file
  4. Once the workflow completes, open the "Export to File" node and download the output
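To make the transformation concrete, here is a minimal standalone sketch of what the workflow does with the export, using only the Python standard library: read internal_html.csv, keep pages with status 200, "Indexable" indexability, and a text/html content type, then emit one "- [title](url): description" line per page under the website name and description. The file paths, site name, and description are placeholders; inside n8n these steps are handled by the "Set useful fields", "Filter URLs", "Set Field - llms.txt Row", and "Summarize - Concatenate" nodes.

```python
# Standalone sketch of the template's CSV -> llms.txt transformation.
# Column names match the English Screaming Frog export ("Address", "Title 1", ...);
# the workflow itself also accepts the French, Italian, German, and Spanish variants.
import csv

SITE_NAME = "The best website ever"          # placeholder, taken from the form
SITE_DESCRIPTION = "A short description."    # placeholder, taken from the form

rows = []
with open("internal_html.csv", newline="", encoding="utf-8") as f:
    for page in csv.DictReader(f):
        # Same conditions as the "Filter URLs" node.
        if (
            page.get("Status Code") == "200"
            and page.get("Indexability") == "Indexable"
            and "text/html" in page.get("Content Type", "")
        ):
            title = page.get("Title 1", "")
            url = page.get("Address", "")
            description = page.get("Meta Description 1", "")
            # Same row format as the "Set Field - llms.txt Row" node:
            # "- [title](url): description", with no description part if it is empty.
            suffix = f": {description}" if description else ""
            rows.append(f"- [{title}]({url}){suffix}")

llms_txt = f"# {SITE_NAME}\n> {SITE_DESCRIPTION}\n\n" + "\n".join(rows)

with open("llms.txt", "w", encoding="utf-8") as f:
    f.write(llms_txt)
```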

Example use cases

This automation is useful in many scenarios related to website optimization and content management:

  • Generating exclusion files for SEO tools
  • Creating a site map for search engines
  • Automating the website audit process
  • Preparing data for site structure analysis
  • Optimizing indexing by search engine crawlers
  • Building URL lists for A/B tests
  • Generating client reports with a list of subpages

Recommended usage

The most convenient way to use this workflow is directly in the n8n UI: click 'Test Workflow' and submit the file through the form.

   Copy the template code   
{"id":"","meta":{"instanceId":"","templateCredsSetupCompleted":true},"name":"Generate AI-Ready llms.txt Files from Screaming Frog Website Crawls","tags":[],"nodes":[{"id":"ca701618-b2d5-48ee-a503-d3513d018a65","name":"Sticky Note","type":"n8n-nodes-base.stickyNote","position":[360,-500],"parameters":{"color":7,"width":360,"height":860,"content":"## Form - Screaming Frog internal_html.csv upload nnThis form node is used to trigger the workflow. nnIt contains **three input fields**: n- Name of the website n- Short description of the website n- **Screaming Frog** export containing the internal URLs nnnnIt is recommended to use the **internal_html.csv** export, but **internal_all.csv** will also work, as the workflow includes a filter to process only indexable URLs.n"},"typeVersion":1},{"id":"bc040ca0-f38d-4458-a60c-17f71dbfd1ea","name":"Sticky Note1","type":"n8n-nodes-base.stickyNote","position":[780,-500],"parameters":{"color":7,"width":360,"height":860,"content":"## Extract data from Screaming Frog filennThis node extracts data from the **CSV file** provided by the user. nnIt produces an output that is **easily usable** in the following nodes. nn⚠️ **Caution:** nIf the uploaded file is **not** the expected Screaming Frog export, the workflow will still proceed but will likely **fail in the next steps** due to missing required fields. nn"},"typeVersion":1},{"id":"f71a7d10-847d-48e7-8820-ec0c1e7ea055","name":"Sticky Note2","type":"n8n-nodes-base.stickyNote","position":[1200,-500],"parameters":{"color":7,"width":360,"height":860,"content":"## Set Useful Fields nnThis node sets **7 key fields** from the Screaming Frog export: nn- `url` → from the **"Address"** column n- `title` → from the **"Title 1"** column n- `description` → from the **"Meta Description 1"** column n- `status` → from the **"Status Code"** column n- `indexability` → from the **"Indexability"** column n- `content_type` → from the **"Content Type"** column n- `word_count` → from the **"Word Count"** column nnn**Multi-language compatibility** nIf you're using Screaming Frog in **French, Italian, German, or Spanish**, the column names will be different. nHowever, the workflow is designed to handle this, so it will **still work correctly**! 
🥳n"},"typeVersion":1},{"id":"6f6546b8-adeb-4998-ae19-d93525337eb7","name":"Set useful fields","type":"n8n-nodes-base.set","position":[1340,60],"parameters":{"options":{},"assignments":{"assignments":[{"id":"0e7d4a06-83fc-4834-93fe-2e758cbe2307","name":"url","type":"string","value":"={{ $json.Address || $json.Adresse || $json.Dirección || $json.Indirizzo }}"},{"id":"c82f4d4c-9d0b-4c7d-9647-5d0240b58643","name":"title","type":"string","value":"={{ $json['Title 1'] || $json['Titolo 1'] || $json['Titolo 1'] || $json['Título 1'] || $json['Titel 1'] }}"},{"id":"abea81db-ce3b-4ac1-bd21-09ccfffb567a","name":"description","type":"string","value":"={{ $json['Meta Description 1'] || $json['Meta description 1'] }}"},{"id":"2ca75d74-70f8-400b-b862-9da186135915","name":"statut","type":"string","value":"={{ $json['Status Code'] || $json['Code HTTP'] || $json['Status-Code'] || $json['Código de respuesta'] || $json['Codice di stato']}}"},{"id":"754d3202-38b0-4d79-ba24-8078b3244307","name":"indexability","type":"string","value":"={{ $json.Indexability || $json.Indexabilité || $json.Indicizzabilità || $json.Indexabilidad || $json.Indexierbarkeit}}"},{"id":"8bc6583d-bb34-4d22-b310-fe79bb8ac85d","name":"content_type","type":"string","value":"={{ $json['Content Type'] || $json['Type de contenu'] || $json['Tipo di contenuto'] || $json['Tipo de contenido'] || $json['Inhaltstyp']}}"},{"id":"c874ba1a-769e-43d3-9555-8c9914ca9b76","name":"word_count","type":"string","value":"={{ $json['Word Count'] || $json['Nombre de mots'] || $json['Conteggio delle parole'] || $json['Conteggio delle parole'] || $json['Recuento de palabras'] || $json['Wortanzahl'] }}"}]}},"typeVersion":3.4},{"id":"1a9af7a0-d2d5-44cb-9770-2d5a1e5706f4","name":"Text Classifier","type":"@n8n/n8n-nodes-langchain.textClassifier","disabled":true,"position":[2260,60],"parameters":{"options":{},"inputText":"=url : {{ $json.url }}ntitle : {{ $json.title }}ndescription : {{ $json.description }}nwords count : {{ $json.word_count }}","categories":{"categories":[{"category":"useful_content","description":"Pages that are likely to contain high-quality content, making them suitable for inclusion in a file that aids content discovery for an LLM. "},{"category":"other_content","description":"Pages that should not be included (e.g., pagination, or low-value content)."}]}},"typeVersion":1},{"id":"74a4e378-4228-4142-92ca-e541efde2b15","name":"OpenAI Chat Model","type":"@n8n/n8n-nodes-langchain.lmChatOpenAi","position":[2180,240],"parameters":{"model":{"__rl":true,"mode":"list","value":"gpt-4o-mini"},"options":{}},"credentials":{"openAiApi":{"id":"","name":"OpenAi Connection"}},"typeVersion":1.2},{"id":"63dc6cfe-bc73-43b5-8c7d-4f5fd6501d3b","name":"No Operation, do nothing","type":"n8n-nodes-base.noOp","position":[2580,200],"parameters":{},"typeVersion":1},{"id":"cb555b99-9e63-4b6b-a1fc-512b5467d666","name":"Sticky Note3","type":"n8n-nodes-base.stickyNote","position":[1620,-500],"parameters":{"color":7,"width":360,"height":860,"content":"## Filter URLs nnThis **filter node** is used to keep only the URLs that meet the following conditions: n- `status` = **200** n- `indexability` = **indexable** n- `content_type` contains **text/html** nnnThese filters are even **more useful** if the uploaded file is an **internal_all.csv** instead of an **internal_html.csv**. nn### **Tips:** nYou can **add more filters** to refine the URLs included in your `llms.txt` file. nn💡 **Examples:** n- **Filter by word count** → Ensure pages contain **enough text content**. 
n- **Filter by URL path** → Keep only **specific folders or categories** in the `llms.txt` file. n- **Filter by meta description** → Exclude URLs **without a meta description**, as this field will be used in the `llms.txt` file to describe each piece of content. n"},"typeVersion":1},{"id":"e34e56e2-5cc8-4e50-bfb0-3aa2e5e04abf","name":"Filter URLs","type":"n8n-nodes-base.filter","position":[1740,60],"parameters":{"options":{},"conditions":{"options":{"version":2,"leftValue":"","caseSensitive":true,"typeValidation":"strict"},"combinator":"and","conditions":[{"id":"cef4feaa-1c46-45b1-92b7-f5c2051b1dc5","operator":{"type":"number","operation":"equals"},"leftValue":"={{ Number($json.statut) }}","rightValue":200},{"id":"bb821656-9740-4da4-8aa9-f65ad098c470","operator":{"type":"boolean","operation":"true","singleValue":true},"leftValue":"={{ ["Indexable", "Indicizzabile", "Indexierbar"].includes($json.indexability) }}","rightValue":"={{ "Indexable" || "Indicizzabile" }}"},{"id":"5c93ddb8-8091-406a-bc04-fa14e8b73fb9","operator":{"type":"string","operation":"contains"},"leftValue":"={{ $json.content_type }}","rightValue":"text/html"}]}},"typeVersion":2.2},{"id":"b98f19a8-afd3-4d26-8063-dee3ee75055f","name":"Sticky Note4","type":"n8n-nodes-base.stickyNote","position":[2040,-800],"parameters":{"color":2,"width":740,"height":1160,"content":"## Text Classifiernn🚫 **This node is deactivated by default** in the template. nnYou can **enable it** if you want to add a more **"intelligent" 🤓 filter** to refine the URLs included in the `llms.txt` file, helping LLMs discover and prioritize valuable content.nn### How It Works:nThis node has **two outputs**: n- **`useful_content`** → Pages that are **likely to contain high-quality content**, making them suitable for inclusion in a file that **aids content discovery for an LLM**. n- **`other_content`** → Pages that should **not** be included (e.g., pagination or low-value content). nnnYou can **modify the description** in the node to fine-tune the classification according to your needs. nn### Input Fields:n- **url** → `{{ $json.url }}` n- **title** → `{{ $json.title }}` n- **description** → `{{ $json.description }}` n- **word_count** → `{{ $json.word_count }}` nn### Why use an LLM? nA **language model (LLM)** can **analyze** the **URL, title, and description** to identify pages that **most likely contain meaningful and relevant content**. nThis allows it to **prioritize valuable pages** and structure the data for **better content discovery and training purposes**. nn### **For large websites** nIf you have a **very large website**, consider using a **Loop Over Items** node to make the workflow **more robust** and ensure all pages are processed. nAlso, using a **Loop Over Items** node make it **easier** to handle: n- **Timeouts** n- **API quotas** n- **Other scalability issues**nn### Tokens usagenFinally, keep in mind that **more pages mean more tokens and more billed LLM API calls**.nnnnnnnn"},"typeVersion":1},{"id":"63e3ea7a-cec3-442c-9812-771def0a9949","name":"Sticky Note5","type":"n8n-nodes-base.stickyNote","position":[2840,-500],"parameters":{"color":7,"width":360,"height":860,"content":"## Set Field - llms.txt RownnThis node **sets** the row format for the `llms.txt` file. 
nn### Row Structure:nEach row follows this format: nn- `- [title](link): description` nnIf the URL **has no description** (from the **Meta Description** in the Screaming Frog export), the row will be: nn- `- [title](link)` n"},"typeVersion":1},{"id":"78f58220-feb5-4044-b994-39a0e4f1e9e4","name":"Sticky Note6","type":"n8n-nodes-base.stickyNote","position":[3260,-500],"parameters":{"color":7,"width":360,"height":860,"content":"## Summarize - ConcatenatennThis node concatenates all the output from the previous node, ensuring each row is on a separate line."},"typeVersion":1},{"id":"7a119633-7cd3-4de5-a1cd-7f708e1abf4a","name":"Sticky Note7","type":"n8n-nodes-base.stickyNote","position":[3680,-500],"parameters":{"color":7,"width":360,"height":860,"content":"## Set Fields - llms.txt ContentnnThis node sets the content of the `llms.txt` file using:nn- The **website title** provided in the form (first node).n- The **website description** provided in the form (first node).n- The output from the previous node, which includes all the URLs, their titles, and their descriptions that will appear in the `llms.txt` file.n"},"typeVersion":1},{"id":"554f6858-68e8-4b35-a6c4-21bed6832323","name":"Sticky Note8","type":"n8n-nodes-base.stickyNote","position":[4100,-500],"parameters":{"color":7,"width":360,"height":860,"content":"## Generate llms.txt filennThis node **creates** the `llms.txt` file, which can be **downloaded directly** within n8n. n"},"typeVersion":1},{"id":"24bdefba-e2f2-41f0-93e7-9f8d2fc11f43","name":"Sticky Note9","type":"n8n-nodes-base.stickyNote","position":[4520,-500],"parameters":{"color":7,"width":360,"height":860,"content":"## upload file anywherennInstead of downloading the file directly from the n8n workflow, you can **replace this node node** with a Drive node (e.g., **Google Drive** or **OneDrive**) to upload the `llms.txt` file to a folder of your choice. n n**Name the file properly** (e.g., include the website name) to make it easier to find and distinguish between files when working on multiple websites. n"},"typeVersion":1},{"id":"a3be51e3-810c-40a7-a996-98a3d383c2b9","name":"Summarize - Concatenate","type":"n8n-nodes-base.summarize","position":[3380,40],"parameters":{"options":{},"fieldsToSummarize":{"values":[{"field":"llmTxtRow","separateBy":"n","aggregation":"concatenate"}]}},"typeVersion":1.1},{"id":"8d3a892a-3d11-4d8a-8ec6-84f8f3af1183","name":"Set Fields - llms.txt Content","type":"n8n-nodes-base.set","position":[3820,40],"parameters":{"options":{},"assignments":{"assignments":[{"id":"97062a99-e944-4e1e-89b1-62cf9e3462dd","name":"llmsTxtFile","type":"string","value":"=# {{ $('Form - Screaming frog internal_html.csv upload').item.json['What is the name of your website?'] }}n> {{ $('Form - Screaming frog internal_html.csv upload').item.json['Can you provide a short description of your website? (in the language of the website)'] }}nn{{ $json.concatenated_llmTxtRow }}"}]}},"typeVersion":3.4},{"id":"bc2a692a-47ea-4bf1-a102-e607fd544158","name":"upload file anywhere","type":"n8n-nodes-base.noOp","position":[4640,40],"parameters":{},"typeVersion":1},{"id":"404510a2-35b2-44cf-9d02-eb0abcf4e9b3","name":"Set Field - llms.txt Row","type":"n8n-nodes-base.set","position":[2960,40],"parameters":{"options":{},"assignments":{"assignments":[{"id":"95e75caa-8110-476b-9cb1-73c15361fa56","name":"llmTxtRow","type":"string","value":"=- [{{ $json.title }}]({{ $json.url }}){{ $json.description ? 
': ' + $json.description : '' }}"}]}},"typeVersion":3.4},{"id":"f54d51f2-17bc-4c58-b177-0e77e16f7b72","name":"Sticky Note10","type":"n8n-nodes-base.stickyNote","position":[-420,-1020],"parameters":{"color":5,"width":700,"height":1380,"content":"# Generate AI-Ready llms.txt Files from Screaming Frog Website Crawls nnThis workflow helps you generate an **llms.txt** file (if you're unfamiliar with it, check out [this article](https://towardsdatascience.com/llms-txt-414d5121bcb3/)) using a **Screaming Frog export**. nn[Screaming Frog](https://www.screamingfrog.co.uk/seo-spider/) is a well-known website crawler. nYou can easily crawl a website. Then, export the **"internal_html"** section in CSV format. nn## How It Works: nnA **form** allows you to enter: n- The **name of the website** n- A **short description** n- The **internal_html.csv** file from your Screaming Frog export nnnOnce the form is submitted, the **workflow is triggered automatically**, and you can **download the llms.txt file directly from n8n**. nn## Downloading the FilenSince the last node in this workflow is **"Convert to File"**, you will need to **download the file directly from the n8n UI**. nHowever, you can easily **add a node** (e.g., Google Drive, OneDrive) to automatically upload the file **wherever you want**. nn## AI-Powered Filtering (Optional): nThis workflow includes a **text classifier node**, which is **deactivated by default**. n- You can **activate it** to apply a more **intelligent filter** to select URLs for the `llms.txt` file. n- Consider modifying the **description** in the classifier node to specify the type of URLs you want to include. nn## How to Use This Workflow nn1. **Crawl the website** you want to generate an `llms.txt` file for using **Screaming Frog**. n2. **Export the "internal_html"** section in CSV format. n ![Screaming Frog internal html export](https://i.imgur.com/M0nJQiV.png) n3. In **n8n**, click **"Test Workflow"**, fill in the form, and **upload** the `internal_html.csv` file. n4. Once the workflow is complete, go to the **"Export to File"** node and **download the output**. nn**That's it! You now have your llms.txt file!** nnnn**Recommended Usage:** nUse this workflow **directly in the n8n UI by clicking** 'Test Workflow' and uploading the file in the form."},"typeVersion":1},{"id":"e33104af-802a-43f2-b26d-f368f7de2fd7","name":"Form - Screaming frog internal_html.csv upload","type":"n8n-nodes-base.formTrigger","position":[460,60],"webhookId":"8791f39a-3d81-405c-b177-0a733ebf74cb","parameters":{"options":{"buttonLabel":"Get the llms.txt file"},"formTitle":"llms.txt Generator - From Screaming Frog export","formFields":{"values":[{"fieldLabel":"What is the name of your website?","placeholder":"Example : The best website ever","requiredField":true},{"fieldLabel":"Can you provide a short description of your website? 
(in the language of the website)","placeholder":"Example : This is the best website ever because all the content is engaging and valuable.","requiredField":true},{"fieldType":"file","fieldLabel":"screaming_frog_export","multipleFiles":false,"requiredField":true,"acceptFileTypes":".csv"}]},"responseMode":"lastNode","formDescription":"Generate a simple llms.txt file from a Screaming Frog ExportnIt is recommended to use the internal_html.csv export, although internal_all.csv will also work.nnFill in the fields in this form.Just fill in the fields in this form 😄"},"typeVersion":2.2},{"id":"f6b17fdd-a098-411e-8d53-3f6e638cc3ba","name":"Extract data from Screaming Frog file","type":"n8n-nodes-base.extractFromFile","position":[900,60],"parameters":{"options":{},"operation":"xls","binaryPropertyName":"screaming_frog_export"},"typeVersion":1},{"id":"6bbd8d1f-3322-4c6d-af08-c842386239ce","name":"Generate llms.txt file","type":"n8n-nodes-base.convertToFile","position":[4220,40],"parameters":{"options":{"encoding":"utf8","fileName":"llms.txt"},"operation":"toText","sourceProperty":"llmsTxtFile"},"typeVersion":1.1}],"active":false,"pinData":{},"settings":{"executionOrder":"v1"},"versionId":"","connections":{"Filter URLs":{"main":[[{"node":"Text Classifier","type":"main","index":0}]]},"Text Classifier":{"main":[[{"node":"Set Field - llms.txt Row","type":"main","index":0}],[{"node":"No Operation, do nothing","type":"main","index":0}]]},"OpenAI Chat Model":{"ai_languageModel":[[{"node":"Text Classifier","type":"ai_languageModel","index":0}]]},"Set useful fields":{"main":[[{"node":"Filter URLs","type":"main","index":0}]]},"Generate llms.txt file":{"main":[[]]},"Summarize - Concatenate":{"main":[[{"node":"Set Fields - llms.txt Content","type":"main","index":0}]]},"Set Field - llms.txt Row":{"main":[[{"node":"Summarize - Concatenate","type":"main","index":0}]]},"Set Fields - llms.txt Content":{"main":[[{"node":"Generate llms.txt file","type":"main","index":0}]]},"Extract data from Screaming Frog file":{"main":[[{"node":"Set useful fields","type":"main","index":0}]]},"Form - Screaming frog internal_html.csv upload":{"main":[[{"node":"Extract data from Screaming Frog file","type":"main","index":0}]]}}}