Darmowy szablon automatyzacji

5 sposobów na przetwarzanie obrazów i plików PDF za pomocą Gemini AI w n8n

14973
25 dni temu
28
bloków


Jak to działa

Wielu użytkowników pytało na forum wsparcia o różne metody analizowania obrazów i dokumentów PDF za pomocą Google Gemini AI w n8n. Ten szablon automatyzacji odpowiada na to pytanie, demonstrując pięć różnych podejść:

  • Pojedynczy obraz z automatycznym przekazywaniem binarnym - Najprostsze podejście wykorzystujące automatyczną obsługę danych binarnych przez AI Agent
  • Wiele obrazów ze wstępnie zdefiniowanymi promptami - Do spersonalizowanej analizy z różnymi instrukcjami dla każdego obrazu
  • Natywne przetwarzanie element po elemencie w n8n - Do obsługi wielu elementów przy użyciu standardowego paradygmatu workflow n8n
  • Analiza PDF przez bezpośrednie API - Do analizy dokumentów i ekstrakcji tekstu
  • Analiza obrazów przez bezpośrednie API - Do bezpośredniej kontroli nad parametrami API

Każda metoda ma swoje zalety w zależności od konkretnego przypadku użycia, ilości danych i potrzeb personalizacji.

Kroki konfiguracji

Czas konfiguracji: ~5-10 minut

Będziesz potrzebować:

  • Klucz API Google Gemini
  • n8n z węzłami HTTP Request i AI Agent

Ważne: Dla węzłów HTTP Request wykonujących bezpośrednie wywołania API do Gemini (Metody 3, 4 i 5), musisz skonfigurować Query Authentication z kluczem API Gemini. Dodaj parametr o nazwie "key" z wartością Twojego klucza API w sekcji Query Auth tych węzłów.

Zaktualizuję ten szablon, jeśli znajdę lepsze sposoby. Daj mi znać, jeśli znasz inne metody. Zawsze chętny do nauki 🙂

Przykłady zastosowań

Ta automatyzacja może być wykorzystana w wielu scenariuszach biznesowych i technicznych. Oto kilka potencjalnych zastosowań:

  • Automatyczna analiza zdjęć produktów w e-commerce pod kątem zgodności z wytycznymi
  • Ekstrakcja tekstu z faktur i dokumentów księgowych w formacie PDF
  • Moderacja treści graficznych na platformach społecznościowych
  • Automatyczne tagowanie i kategoryzacja zdjęć w systemach DAM
  • Analiza dokumentów prawnych i umów pod kątem kluczowych klauzul
  • Przetwarzanie zdjęć z formularzy zgłoszeniowych i aplikacyjnych
  • Automatyczne generowanie opisów alt text dla obrazów na stronach internetowych


   Skopiuj kod szablonu   
{"meta":{"instanceId":"d4d7965840e96e50a3e02959a8487c692901dfa8d5cc294134442c67ce1622d3","templateCredsSetupCompleted":true},"nodes":[{"id":"eec7d9b8-d1e3-4a43-9e0d-f6d750e736b5","name":"When clicking ‘Test workflow’","type":"n8n-nodes-base.manualTrigger","position":[-640,-400],"parameters":{},"typeVersion":1},{"id":"5276a2cf-3d42-409a-800d-9080aa5e1a09","name":"Google Gemini Chat Model","type":"@n8n/n8n-nodes-langchain.lmChatGoogleGemini","position":[1820,-60],"parameters":{"options":{},"modelName":"models/gemini-2.0-flash"},"credentials":{"googlePalmApi":{"id":"BB5B0v4OaFQeEt3C","name":"Google Gemini(PaLM) Api account"}},"typeVersion":1},{"id":"1b89dca9-1137-4e0f-b3ff-1b354152c128","name":"AI Agent","type":"@n8n/n8n-nodes-langchain.agent","position":[1880,-260],"parameters":{"text":"={{ $('Loop Over Items').all() }}","options":{"passthroughBinaryImages":true},"promptType":"define"},"typeVersion":1.7},{"id":"00348203-882f-48da-8127-e57cf30c5b20","name":"Get image from unsplash2","type":"n8n-nodes-base.httpRequest","position":[1160,-280],"parameters":{"url":"={{ $json.url }}","options":{}},"typeVersion":4.2},{"id":"1b07777f-954b-4471-ab4b-070c902c0bc1","name":"Split Out","type":"n8n-nodes-base.splitOut","position":[500,-280],"parameters":{"options":{},"fieldToSplitOut":"urls"},"typeVersion":1},{"id":"3646a695-d63b-4e25-93b4-a208592e6eac","name":"Get image from unsplash3","type":"n8n-nodes-base.httpRequest","position":[720,0],"parameters":{"url":"={{ $json.urls }}","options":{}},"typeVersion":4.2},{"id":"34ef745b-23c6-422d-9367-de79eeb54e77","name":"Transform to base","type":"n8n-nodes-base.extractFromFile","position":[940,0],"parameters":{"options":{},"operation":"binaryToPropery"},"typeVersion":1},{"id":"762507d2-2093-4ac8-a4d4-2972c53fa839","name":"Call Gemini API1","type":"n8n-nodes-base.httpRequest","position":[1160,0],"parameters":{"url":"=https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent","method":"POST","options":{},"jsonBody":"={n "contents": [{n "parts":[n {"text": "Whats on this image?"},n {n "inline_data": {n "mime_type": "image/jpeg",n "data": "{{ $json.data }}"n }n }n ]n }]n}n","sendBody":true,"specifyBody":"json","authentication":"genericCredentialType","genericAuthType":"httpQueryAuth"},"credentials":{"httpQueryAuth":{"id":"Eh1GI1UjOtJk4CDZ","name":"Query Gemini Auth account"}},"typeVersion":4.2},{"id":"0dfa7ae9-1eda-49ca-8067-c467346c27cb","name":"Loop Over Items","type":"n8n-nodes-base.splitInBatches","position":[1560,-280],"parameters":{"options":{}},"typeVersion":3},{"id":"c5562d6b-0b56-4f15-bbd0-441359f89d86","name":"AI Agent2","type":"@n8n/n8n-nodes-langchain.agent","position":[1560,-700],"parameters":{"text":"whats on the image","options":{"passthroughBinaryImages":true},"promptType":"define"},"typeVersion":1.7},{"id":"277c44ec-109b-4dfa-bc04-defec26e6581","name":"Google Gemini Chat Model1","type":"@n8n/n8n-nodes-langchain.lmChatGoogleGemini","position":[1500,-540],"parameters":{"options":{},"modelName":"models/gemini-2.0-flash"},"credentials":{"googlePalmApi":{"id":"BB5B0v4OaFQeEt3C","name":"Google Gemini(PaLM) Api account"}},"typeVersion":1},{"id":"c303ab4e-155f-4c36-bf07-4825d0d1fd93","name":"Get image from unsplash4","type":"n8n-nodes-base.httpRequest","position":[1160,-700],"parameters":{"url":"=https://plus.unsplash.com/premium_photo-1740023685108-a12c27170d51?q=80&w=2340&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D","options":{}},"typeVersion":4.2},{"id":"b786ac03-75d0-4830-849f-ee9ed8e108fa","name":"Get PDF file","type":"n8n-nodes-base.httpRequest","position":[260,360],"parameters":{"url":"https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf","options":{}},"typeVersion":4.2},{"id":"bb7ba59e-050a-4259-8238-dc25f458e3c4","name":"Get image from unsplash","type":"n8n-nodes-base.httpRequest","position":[260,660],"parameters":{"url":"=https://plus.unsplash.com/premium_photo-1740023685108-a12c27170d51?q=80&w=2340&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D","options":{}},"typeVersion":4.2},{"id":"65a4cb42-632b-4fbd-8e28-10e36e9f1e00","name":"Call Gemini API with PDF","type":"n8n-nodes-base.httpRequest","position":[720,360],"parameters":{"url":"=https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent","method":"POST","options":{},"jsonBody":"={n "contents": [{n "parts":[n {"text": "Whats on this pdf?"},n {n "inline_data": {n "mime_type": "application/pdf",n "data": "{{ $json.data }}"n }n }n ]n }]n}n","sendBody":true,"specifyBody":"json","authentication":"genericCredentialType","genericAuthType":"httpQueryAuth"},"credentials":{"httpQueryAuth":{"id":"Eh1GI1UjOtJk4CDZ","name":"Query Gemini Auth account"}},"typeVersion":4.2},{"id":"75a13a82-c051-449a-bf52-837256c18f22","name":"Call Gemini API with Image","type":"n8n-nodes-base.httpRequest","position":[720,660],"parameters":{"url":"=https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent","method":"POST","options":{},"jsonBody":"={n "contents": [{n "parts":[n {"text": "Whats on this image?"},n {n "inline_data": {n "mime_type": "image/jpeg",n "data": "{{ $json.data }}"n }n }n ]n }]n}n","sendBody":true,"specifyBody":"json","authentication":"genericCredentialType","genericAuthType":"httpQueryAuth"},"credentials":{"httpQueryAuth":{"id":"Eh1GI1UjOtJk4CDZ","name":"Query Gemini Auth account"}},"typeVersion":4.2},{"id":"55964e8d-ce0a-4157-b536-41862da946ab","name":"Transform to base64 (image)","type":"n8n-nodes-base.extractFromFile","position":[500,660],"parameters":{"options":{},"operation":"binaryToPropery"},"typeVersion":1},{"id":"57fc92ee-2a74-4d49-ae7a-6c499a1f380e","name":"Transform to base64 (pdf)","type":"n8n-nodes-base.extractFromFile","position":[500,360],"parameters":{"options":{},"operation":"binaryToPropery"},"typeVersion":1},{"id":"dc0a5515-1a51-4a11-9b39-ac8b30bcb0ba","name":"Define Multiple Image URLs","type":"n8n-nodes-base.set","position":[260,0],"parameters":{"options":{},"assignments":{"assignments":[{"id":"95f15b6e-f66a-450a-be19-75d4c339f943","name":"urls","type":"array","value":"=[n "https://plus.unsplash.com/premium_photo-1740023685108-a12c27170d51?q=80&w=2340&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D",n "https://images.unsplash.com/photo-1739609579483-00b49437cc45?q=80&w=2342&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D"n]n"}]}},"typeVersion":3.4},{"id":"4f9037a3-adc5-4ada-b977-6ecdf6f58705","name":"Split Out to multiple items","type":"n8n-nodes-base.splitOut","position":[500,0],"parameters":{"options":{},"fieldToSplitOut":"urls"},"typeVersion":1},{"id":"0e0bbf58-4c83-4769-9616-c120296ce5e0","name":"Sticky Note","type":"n8n-nodes-base.stickyNote","position":[-760,-780],"parameters":{"color":5,"width":440,"height":340,"content":"## When clicking "Test workflow"nnThis trigger demonstrates five different approaches to analyze media with AI:n1. Top branch: Single image with automatic binary passthroughn2. Second branch: Multiple images with custom promptsn3. Third branch: Standard n8n item processing with direct APIn4. Fourth branch: PDF analysis via direct APIn5. Fifth branch: Image analysis via direct APInnEach approach has advantages depending on your use case.nn"},"typeVersion":1},{"id":"2a6236b2-c5a3-4feb-883a-b3654ce78278","name":"Define URLs And Prompts","type":"n8n-nodes-base.set","position":[260,-280],"parameters":{"options":{},"assignments":{"assignments":[{"id":"95f15b6e-f66a-450a-be19-75d4c339f943","name":"urls","type":"array","value":"={{ n[n {n url: "https://plus.unsplash.com/premium_photo-1740023685108-a12c27170d51?q=80&w=2340&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D",n prompt: "what is special about this image?",n process: truen },n {n url: "https://images.unsplash.com/photo-1739609579483-00b49437cc45?q=80&w=2342&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D",n prompt: "what is the main color?",n process: truen },n {n url: "https://plus.unsplash.com/premium_photo-1740023685108-a12c27170d51?q=80&w=2340&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D",n prompt: "test", n process: falsen }n]n}}"}]}},"typeVersion":3.4},{"id":"91dd2eec-3179-4c14-857e-bb65499723be","name":"Sticky Note1","type":"n8n-nodes-base.stickyNote","position":[1920,-700],"parameters":{"color":4,"width":440,"height":300,"content":"## METHOD 1: Single image with automatic binary passthroughnnThis branch demonstrates the easiest way to analyze a single image with AI:n1. Fetch an image from Unsplashn2. Send directly to the AI Agent with "Automatically Passthrough Binary Images" enabledn3. Get AI analysis without any data transformationnnBEST FOR: Quick implementation with minimal configuration for single image analysis.nn"},"typeVersion":1},{"id":"2b48f2d0-8c55-40da-acf5-0d9267691817","name":"Filter (optional)","type":"n8n-nodes-base.filter","position":[720,-280],"parameters":{"options":{},"conditions":{"options":{"version":2,"leftValue":"","caseSensitive":true,"typeValidation":"strict"},"combinator":"and","conditions":[{"id":"51b55272-94af-4761-a42e-5c91f3b8e39e","operator":{"type":"boolean","operation":"true","singleValue":true},"leftValue":"={{ $json.process }}","rightValue":""}]}},"typeVersion":2.2},{"id":"f9f98d03-99c6-4424-b7cc-fd2ef836173b","name":"Sticky Note2","type":"n8n-nodes-base.stickyNote","position":[2220,-260],"parameters":{"color":3,"width":460,"height":360,"content":"## METHOD 2: Multiple images with custom promptsnnThis branch shows how to analyze different images with custom instructions:n1. Prepare data structure with image URLs and their corresponding promptsn2. Split into individual items and filter if neededn3. Fetch each image from Unsplashn4. Process sequentially through the Loop noden5. Analyze each with its specific prompt using the AI AgentnnBEST FOR: When you need different analysis goals for each image (e.g., one for object detection, another for scene description).n"},"typeVersion":1},{"id":"057ecbfa-3079-451f-9376-eefcdf4ab96a","name":"Sticky Note3","type":"n8n-nodes-base.stickyNote","position":[1360,0],"parameters":{"color":7,"width":360,"height":380,"content":"## METHOD 3: Standard n8n item processing with direct APInnThis branch demonstrates n8n's standard approach to handling multiple items:n1. Define multiple image URLs in a single noden2. Split into individual items for processingn3. Fetch each image individuallyn4. Transform each to base64 formatn5. Make direct API calls to Gemini for each itemnnBEST FOR: Processing multiple images using n8n's standard item-by-item approach with direct API control.nn"},"typeVersion":1},{"id":"3dfbde27-7141-4558-a958-00b2891274ec","name":"Sticky Note4","type":"n8n-nodes-base.stickyNote","position":[920,360],"parameters":{"width":340,"height":280,"content":"## METHOD 4: PDF analysis via direct APInnThis branch shows how to analyze PDF documents:n1. Fetch a PDF filen2. Transform to base64 formatn3. Send directly to Gemini API for analysisnnBEST FOR: Document analysis, text extraction, summarization of PDFs.n"},"typeVersion":1},{"id":"f50dae19-77e4-4450-a516-7f0e676d161a","name":"Sticky Note5","type":"n8n-nodes-base.stickyNote","position":[920,660],"parameters":{"width":340,"height":300,"content":"## METHOD 5: Image analysis via direct APInnThis branch demonstrates direct API control for image analysis:n1. Fetch an imagen2. Transform to base64 formatn3. Make a customized API call to GemininnBEST FOR: Advanced users who need precise control over API parameters and response handling.n"},"typeVersion":1}],"pinData":{},"connections":{"AI Agent":{"main":[[{"node":"Loop Over Items","type":"main","index":0}]]},"Split Out":{"main":[[{"node":"Filter (optional)","type":"main","index":0}]]},"Get PDF file":{"main":[[{"node":"Transform to base64 (pdf)","type":"main","index":0}]]},"Loop Over Items":{"main":[[],[{"node":"AI Agent","type":"main","index":0}]]},"Call Gemini API1":{"main":[[]]},"Filter (optional)":{"main":[[{"node":"Get image from unsplash2","type":"main","index":0}]]},"Transform to base":{"main":[[{"node":"Call Gemini API1","type":"main","index":0}]]},"Define URLs And Prompts":{"main":[[{"node":"Split Out","type":"main","index":0}]]},"Get image from unsplash":{"main":[[{"node":"Transform to base64 (image)","type":"main","index":0}]]},"Get image from unsplash2":{"main":[[{"node":"Loop Over Items","type":"main","index":0}]]},"Get image from unsplash3":{"main":[[{"node":"Transform to base","type":"main","index":0}]]},"Get image from unsplash4":{"main":[[{"node":"AI Agent2","type":"main","index":0}]]},"Google Gemini Chat Model":{"ai_languageModel":[[{"node":"AI Agent","type":"ai_languageModel","index":0}]]},"Google Gemini Chat Model1":{"ai_languageModel":[[{"node":"AI Agent2","type":"ai_languageModel","index":0}]]},"Transform to base64 (pdf)":{"main":[[{"node":"Call Gemini API with PDF","type":"main","index":0}]]},"Define Multiple Image URLs":{"main":[[{"node":"Split Out to multiple items","type":"main","index":0}]]},"Split Out to multiple items":{"main":[[{"node":"Get image from unsplash3","type":"main","index":0}]]},"Transform to base64 (image)":{"main":[[{"node":"Call Gemini API with Image","type":"main","index":0}]]},"When clicking ‘Test workflow’":{"main":[[{"node":"Define URLs And Prompts","type":"main","index":0},{"node":"Define Multiple Image URLs","type":"main","index":0},{"node":"Get PDF file","type":"main","index":0},{"node":"Get image from unsplash","type":"main","index":0},{"node":"Get image from unsplash4","type":"main","index":0}]]}}}
  • API
  • Request
  • URL
  • Build
  • cURL
  • LangChain
  • Chat
  • Conversational
  • Plan and Execute
  • ReAct
  • Tools
Planeta AI 2025 
magic-wandmenu linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram