
[Search] Tavily API (*Domain Control*)

https://www.tavily.com/

 

Tavily - The Web Access Layer for AI Agents

Tavily is the real‑time search engine for AI agents and RAG workflows — Fast and secure APIs for web search and content extraction. Trusted by 600K+ developers.


 

Tavily is an AI-native search API that helps AI agents search for, summarize, and refine reliable, fact-based information in real time. Rather than simply listing URLs, it returns data in a form that an LLM can consume immediately.
Tavily
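  • No HTML, No Noise : strips the HTML tags, ads, and menu bars from web pages and extracts only the plain text (clean context).
  • Context Optimization : with the LLM's token limit in mind, it slices out and returns only the parts most relevant to the question.
  • AI-Specific Ranking : results are re-ranked by the intent of the question and the value of the information, not by SEO (search engine optimization) score.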
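Important Parameters
  • Search Depth : basic (1 API credit) or advanced (2 API credits)
    • advanced is recommended for research tasks where accuracy matters
  • Include Answer : generates an AI answer from the search results in the same call
    • You get a summarized answer to the query without a separate LLM call (it is produced by Tavily's own LLM).
    • include_answer=True
  • Domain Control : include or exclude specific domains
    • Restrict the search to trusted sources, for example technical sites such as GitHub and Stack Overflow
    • include_domains = target_domain # pass a list
  • Search Topic : general, news, or finance
    • For breaking news and current affairs, the news topic returns noticeably more targeted results
    • topic = "general"
    • Use "general" for ordinary searches, "news" when recency matters, and "finance" for finance-focused information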
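The parameters above can be collected into a single set of keyword arguments before calling the API. The following is a minimal sketch, not Tavily's own code: `build_search_params` and `SEARCH_DEPTH_CREDITS` are hypothetical names introduced here, the credit values and topic names are the ones listed above, and the commented-out client call assumes the `tavily-python` package is installed.

```python
from typing import Any, Optional

# Values taken from the parameter list above.
SEARCH_DEPTH_CREDITS = {"basic": 1, "advanced": 2}
TOPICS = {"general", "news", "finance"}


def build_search_params(
    query: str,
    search_depth: str = "basic",
    topic: str = "general",
    include_answer: bool = False,
    include_domains: Optional[list] = None,
) -> dict:
    """Validate the options and return keyword arguments for a search call."""
    if search_depth not in SEARCH_DEPTH_CREDITS:
        raise ValueError(f"unknown search_depth: {search_depth!r}")
    if topic not in TOPICS:
        raise ValueError(f"unknown topic: {topic!r}")
    params: dict[str, Any] = {
        "query": query,
        "search_depth": search_depth,
        "topic": topic,
        "include_answer": include_answer,
    }
    # Only send a domain filter when one was actually requested.
    if include_domains:
        params["include_domains"] = list(include_domains)
    return params


params = build_search_params(
    "What is retrieval-augmented generation?",
    search_depth="advanced",
    include_answer=True,
    include_domains=["github.com", "stackoverflow.com"],
)
print(params)
print("credits:", SEARCH_DEPTH_CREDITS[params["search_depth"]])

# Assumption: with tavily-python installed and an API key available,
# the dict could be passed along like this (not executed here):
#   from tavily import TavilyClient
#   client = TavilyClient(api_key="YOUR_API_KEY")
#   response = client.search(**params)
```

Validating the options locally keeps a typo in `topic` or `search_depth` from costing a network round trip.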
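How it works
  • Real-time aggregation : when a query arrives, hundreds of high-quality web sources are searched in parallel.
  • Data cleaning : unnecessary elements are removed from the collected pages, leaving only the text.
  • Result optimization : the data is reshaped into the length and order best suited for use as LLM context.
  • JSON response : structured data is returned, so developers can feed it into their system without extra parsing.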
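Once the JSON response arrives, turning it into LLM context is mostly string assembly. The sketch below is an illustration under stated assumptions: `build_context` is a hypothetical helper, the sample dictionary is made up, and only the `results`, `title`, `url`, `content`, and `score` fields that appear in Tavily responses are used. A character budget stands in for a real token budget.

```python
def build_context(response: dict, max_chars: int = 1500) -> str:
    """Join result snippets, highest score first, within a character budget."""
    ranked = sorted(
        response.get("results", []),
        key=lambda r: r.get("score", 0.0),
        reverse=True,
    )
    parts = []
    used = 0
    for i, item in enumerate(ranked, start=1):
        block = f"[{i}] {item['title']} ({item['url']})\n{item['content']}"
        # Stop at the first snippet that would exceed the budget.
        if used + len(block) > max_chars:
            break
        parts.append(block)
        used += len(block)
    return "\n\n".join(parts)


# Made-up sample in the same shape as a search response.
sample = {
    "query": "example query",
    "results": [
        {"title": "A", "url": "https://example.com/a", "content": "alpha text", "score": 0.42},
        {"title": "B", "url": "https://example.com/b", "content": "beta text", "score": 0.91},
    ],
}
context = build_context(sample)
print(context)
```

The numbered prefixes make it easy to ask the LLM to cite its sources by index.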
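Expected Effects
  • Fewer hallucinations : because answers are grounded in retrieved real-time data, the model produces markedly fewer false answers.
  • Lower cost : irrelevant text is filtered out in advance, which reduces the token cost of LLM API calls.
  • Developer productivity : a single API replaces hand-written crawling, parsing, and summarization logic.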
Domain Control in the TECHTREE Service (my project)
# TECHTREE Service: actual domain set
# Domain Filtering Configuration
DOMAIN_MAP = {
    "tech_news": [  
        "news.hada.io",                 # GeekNews (High Quality Curated)
    ],
    "engineering": [
        "github.com",                   # Open Source
        "huggingface.co",               # AI Models & Papers
        "openai.com",                   # Official Blog
        "anthropic.com",                # Official Blog
        "langchain.com/blog",           # AI app development trends
        "wandb.ai/fully-connected",     # Practical engineering
        "pytorch.org/blog",             # Official Framework Blog
    ],
    "research": [   
        "arxiv.org",                    # Research Papers
        "paperswithcode.com",           # Papers + Code
        "deepmind.google/research",     # DeepMind Research
        "research.google",              # Google Research
        "scholar.google.com",           # Google Scholar
        "openreview.net"                # Academic Reviews
    ],
    "k_blog": [      
        "techblog.woowahan.com",        # Woowa Bros
        "medium.com/daangn",            # Daangn Market
        "toss.tech",                    # Toss
        "devocean.sk.com",              # SK
        "helloworld.kurly.com",         # Kurly
        "techblog.lycorp.co.jp/ko",     # LINE
        "d2.naver.com",                 # Naver D2
        "kakaoenterprise.com",          # Kakao Enterprise
        "hyperconnect.com",             # Hyperconnect
        "ridicorp.com/story",           # Ridi
        "netmarble.engineering",        # Netmarble
    ]
}
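A map like this still has to be flattened into the single list that `include_domains` expects. The following is one possible sketch: `domains_for` is a hypothetical helper, and `DOMAIN_MAP_SAMPLE` is a shortened copy of the map above so the example runs on its own.

```python
# Shortened copy of DOMAIN_MAP, so the example is self-contained.
DOMAIN_MAP_SAMPLE = {
    "tech_news": ["news.hada.io"],
    "research": ["arxiv.org", "openreview.net"],
    "k_blog": ["toss.tech", "d2.naver.com"],
}


def domains_for(categories, domain_map):
    """Flatten the chosen categories into one de-duplicated list, keeping order."""
    seen = set()
    merged = []
    for category in categories:
        # An unknown category raises KeyError instead of being silently ignored.
        for domain in domain_map[category]:
            if domain not in seen:
                seen.add(domain)
                merged.append(domain)
    return merged


target_domain = domains_for(["research", "tech_news"], DOMAIN_MAP_SAMPLE)
print(target_domain)
```

The resulting `target_domain` list is what would be passed as `include_domains` in the search call.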

 

API Example
  • Query : What are the key differences between GPT-4 and Claude 2?
  • include_answer = "advanced"
{
  "query": "What are the key differences between GPT-4 and Claude 2?",
  "response_time": 2.96,
  "follow_up_questions": null,
  "answer": "GPT‑4 and Claude 2 differ in several core areas: Claude 2 offers a substantially larger context window (around 100 K tokens, with the later Claude 2.1 extending to 200 K, versus GPT‑4’s original 8 K‑token limit and GPT‑4 Turbo’s 128 K limit) which lets it handle far longer documents, and it is generally faster to respond; however GPT‑4 (especially the Turbo and GPT‑4o variants) is multimodal, accepting images, audio and video in addition to text, while Claude 2 is text‑only. GPT‑4’s training data is more recent (cutoff April 2023 for Turbo and October 2023 for GPT‑4o) compared with Claude 2’s early‑2023 cutoff, giving it an edge on very recent events, whereas Claude 2 tends to outperform GPT‑4 on niche tasks such as legal reasoning, mathematics and specialized writing, often showing higher precision and safety tuning. In terms of overall capability, GPT‑4 remains the leader for broad conversational ability, general knowledge, coding and creative content generation, while Claude 2 excels in domain‑specific accuracy and speed for large‑scale text analysis. Pricing also varies, with Claude 2 priced at roughly 0.8 cents per 1 K input tokens and 2.4 cents per 1 K output tokens, whereas GPT‑4 Turbo’s rates are generally higher.",
  "images": [],
  "results": [
    {
      "url": "https://www.swiftask.ai/blog/this-claude-2-vs-gpt-4-comparison-helps-you-see-clearly",
      "title": "GPT-4 vs Claude 2: who will assist you better in 2024? - Swiftask",
      "content": "Claude 2 stands out for its exceptional ability to process documents containing up to 100,000 tokens, providing a significant scope for analyzing large texts. In comparison, GPT-4 has a more limited capacity, processing approximately 4,000 words per prompt due to its contextual limit of 8,192 tokens. This distinction marks a considerable difference in favor of Claude 2, positioning this artificial intelligence as the top choice for users with high demands for volume of processed content. [...] From the beginning of testing with Claude 2, the impression is striking: the speed of this model is incredible. Significantly faster than GPT-4, Claude 2 excels in accelerated response processing. The design of Claude 2 is clearly focused on delivering spectacular performance in instant text generation, providing a notable advantage for use cases that require real-time responses. This speed-optimized efficiency makes Claude 2 a preferred choice for those looking to integrate AI into processes [...] GPT-4 shows a better understanding of general information. In summary, Claude 2 surpasses GPT-4 by offering increased precision in niche areas such as law and mathematics. As for GPT-4, it remains the champion in general knowledge questions. These two AI models offer their own strengths, resulting in targeted expertise based on specific application domains.",
      "score": 0.9999778,
      "raw_content": null,
      "favicon": "https://www.swiftask.ai/favicon.png"
    },
    {
      "url": "https://www.akkio.com/post/gpt-4-turbo-vs-claude-2-1",
      "title": "GPT-4 Turbo vs Claude 2.1: Next-Gen AI Models - Akkio",
      "content": "GPT-4 Turbo has multimodal capabilities to process text, images, audio, video etc., while Claude 2.1 focuses solely on text processing. This makes GPT-4 better for creative applications.\n Claude 2.1 has a much larger context window of 200k tokens compared to GPT-4 Turbo's 128k tokens. This allows Claude to deeply analyze long documents.\n GPT-4 Turbo knowledge cutoff is April 2023, giving it an edge in comprehending very recent events over Claude 2.1's early 2023 cutoff. [...] Artificial intelligence (AI) has been advancing at an incredible pace recently. Two models leading this charge are GPT-4 Turbo from OpenAI and Claude 2.1 from Anthropic.\n\nBoth boast impressive capabilities, but they also have key differences in context window size, multimodal features, pricing, knowledge cutoff dates, performance attributes, and ideal use cases. [...] Claude 2.1 focuses solely on text, lacking these multimodal features even when paired with other APIs from Anthropic. It can generate tables and follow markdown formatting, but it doesn’t have any image or audio generation features, nor does the company feature interfaces where you can combine Claude 2.1 with “plugins” (or custom GPTs”).\n\nGPT-4 Turbo has greater versatility for projects needing a fusion of content types, allowing users for more flexibility.\n\n### Pricing",
      "score": 0.9998832,
      "raw_content": null,
      "favicon": "https://cdn.prod.website-files.com/5c97e8c9de94e8a3480419a5/5efe552e8f16ed0530a0d66e_favicon2x.png"
    },
    {
      "url": "https://docsbot.ai/models/compare/claude-2/gpt-4o",
      "title": "Claude 2 vs GPT-4o - Detailed Performance & Feature Comparison",
      "content": "Compare performance metrics between Claude 2and GPT-4o. See how each model performs on key benchmarks measuring reasoning, knowledge and capabilities. [...] Claude 2, developed by Anthropic, features a large context window of 100,000 tokens. The model costs 0.8 cents per thousand tokens for input and 2.4 cents per thousand tokens for output. It was released on July 11, 2023, and has shown strong performance in the MMLU benchmark with a score of 78.5 in a 5-shot scenario. [...] GPT-4ois13 months newerthan Claude 2.It has more recent training data (October 2023 vs Early 2023).GPT-4o has a larger context window (128K vs 100K tokens).Unlike Claude 2, GPT-4o supports image processing.\n\n## Pricing Comparison\n\nCompare costs for input and output tokens betweenClaude 2and GPT-4o.",
      "score": 0.99986446,
      "raw_content": null,
      "favicon": "https://docsbot.ai/apple-touch-icon.png"
    },
    {
      "url": "https://kimgarst.com/claude-2-vs-gpt-4/",
      "title": "Claude 2 vs GPT-4 in 2023: Comparing the Top AI Models - Kim Garst",
      "content": "So in areas like legal writing, scientific research, coding, accessibility, and safety, Claude 2 matches or surpasses GPT-4. It carves out niches where its specialized capabilities give Claude 2 an advantage over even a giant like GPT-4.\n\nFor broad conversational applications, GPT-4 remains state-of-the-art. But Claude 2's strengths make it a formidable challenger able to go toe-to-toe with GPT-4 in key domains. [...] For natural language processing broadly, GPT-4 remains state-of-the-art. Its sheer model scale and training on a massive internet corpus make it hard to match for conversing, writing, and answering open-ended questions.\n\nHowever, Claude 2 vs GPT-4, the first is competitive or even superior in several important domains: [...] So how do these AI marvels stack up head-to-head? This in-depth Claude 2 versus GPT-4 comparison analyzes all the metrics to reveal where each model excels. By evaluating speed, cost, features, niche accuracy, safety and a few other critical components, my goal is to help you understand the strengths and weaknesses of Claude 2 and GPT-4.\n\nRead my full head-to-head breakdown to learn which of these advanced systems best fits your needs and use cases in 2023 and beyond.",
      "score": 0.9997998,
      "raw_content": null,
      "favicon": "http://kimgarst.com/wp-content/uploads/2021/03/apple-touch-icon-57x57-1.png"
    },
    {
      "url": "https://wesoftyou.com/ai/claude-2-vs-gpt-4-comparison/",
      "title": "Claude 2 VS GPT-4: Comparing AI Language Models for 2025",
      "content": "WeSoftYou provides AI consulting and integration services, where we deal with multiple language models, including Claude 2 and GPT-4. That’s why we decided to share our view on them, dissecting core features, performance, applications, and potential. ChatGPT-4 VS Claude 2, so what will win?\n\n## Understanding Claude 2 VS GPT-4 [...] |  |  |  |\n --- \n| Feature | GPT-4 | Claude 2 |\n| Developed by | OpenAI | Anthropic |\n| Key strengths | Content creation, coding generation, problem solving | Mathematics, learning from debates, truthful modeling, ethics tuning |\n| Reading comprehension | 93rd percentile on the GRE reading test | 86th percentile  on the GRE reading test |\n| Writing | 89th percentile on the GRE analytical writing test | 96th percentile on the GRE analytical writing test | [...] Enter Claude 2 and GPT-4, two titans in the LLM landscape. Both leverage immense neural networks and vast datasets to generate human-quality text, pushing the boundaries of what AI can achieve. However, beneath the surface lie distinct strengths and weaknesses that make each model ideal for specific tasks.",
      "score": 0.99963164,
      "raw_content": null,
      "favicon": "https://wesoftyou.com/wp-content/uploads/2023/02/cropped-favicon-new-180x180.png"
    }
  ],
  "request_id": "11b25e86-b706-4676-89c4-69d2d067b1a8"
}
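A response like the one above can be reduced to the generated answer plus a short source list. This is a minimal sketch: `summarize` is a hypothetical helper, and the embedded JSON is an abridged copy of the response above (two results, `content` omitted, answer shortened) so the example runs on its own.

```python
import json

# Abridged copy of the response above; the real payload has more fields.
raw = """
{
  "query": "What are the key differences between GPT-4 and Claude 2?",
  "answer": "GPT-4 and Claude 2 differ in several core areas (abridged)",
  "results": [
    {
      "url": "https://www.swiftask.ai/blog/this-claude-2-vs-gpt-4-comparison-helps-you-see-clearly",
      "title": "GPT-4 vs Claude 2: who will assist you better in 2024? - Swiftask",
      "score": 0.9999778
    },
    {
      "url": "https://www.akkio.com/post/gpt-4-turbo-vs-claude-2-1",
      "title": "GPT-4 Turbo vs Claude 2.1: Next-Gen AI Models - Akkio",
      "score": 0.9998832
    }
  ]
}
"""


def summarize(response: dict, min_score: float = 0.0):
    """Return the generated answer and (title, url, score) for results at or above min_score."""
    sources = [
        (r["title"], r["url"], r["score"])
        for r in response.get("results", [])
        if r["score"] >= min_score
    ]
    return response.get("answer"), sources


response = json.loads(raw)
answer, sources = summarize(response, min_score=0.9999)
print(answer)
for title, url, score in sources:
    print(f"{score:.7f}  {title}  {url}")
```

The `min_score` threshold here is arbitrary; a suitable cutoff depends on the query and on how many sources the downstream prompt can afford.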