• 13 days ago

I asked AI to rank popular travel blogs for their authenticity, here's what I learned


Over the past months I’ve been increasingly curious whether AI can still distinguish between scaled travel content and writing based on genuine first-hand experience.

So I ran a small experiment.

I tested this across multiple prompts and evaluation criteria to see whether consistent patterns would emerge.

I asked AI a simple question: "which travel blogs still show strong signals of first-hand experience in 2026?"

Instead of focusing on popularity or traffic, I asked it to evaluate signals such as:

• depth of observation
• specificity of details
• structure and practical usability
• consistency across articles
• what I call data density (information that is difficult to generate synthetically at scale)

The results were surprisingly consistent. Several well-known blogs appeared exactly where expected - but a few unexpected ones showed up as well.

What interested me most was not the ranking itself, but the reasoning behind it.

Across multiple prompts, AI repeatedly highlighted the same indicators of authenticity: first-hand specificity, narrative continuity, logistical detail, and decision-level guidance rather than generic inspiration content.

In other words, signals that are increasingly relevant in an era of large-scale AI-assisted publishing, especially as many travel blogs are currently experimenting with AI-assisted workflows.

I described the full methodology and results on my website: en.gancarczyk.com/The-best-English-language-travel-blogs-according-to-AI-ranking--justification-for-the-choice-and-my-commentary/

I’d be very curious whether others here observe similar patterns when comparing experience-driven blogs with template-style travel content.

13 days ago
Founder, Travel Massive

Hi Jacek, welcome to the Travel Massive community and thanks for sharing your research and insights here. Also, impressive travel blog!

I followed your research and have a few thoughts about the methodology:

1. Copy/paste of your prompt into Google search AI (following your video) returned an instant, verbatim result of what you published. This makes it hard to replicate your findings (replicate != cache).

2. I tried the same prompt with Claude and it returned a similar set of websites. The rankings were different, though. So running the same test with multiple AIs gives different results. How would you tell which AI is more accurate?

3. I then asked Claude "explain how you were able to assess the visual authenticity score, were you looking at the photos on the websites?" and it admitted "I did not browse any of the websites or examine any photographs. The scores I assigned are modeled estimates based on my training data, not live assessments".

Therefore I feel that your findings are somewhat flawed / limited because visual authenticity was not actually assessed by your AI (I doubt that Google AI actually looked). This caveat would be worth pointing out to readers in your blog article.

imo, the only way to really assess visual authenticity would be an AI agent that controls a headless browser, has image recognition, and is trained to a level where it can make such an assessment. That requires a lot of energy - more than a search engine or ChatGPT can afford to serve you in a casual LLM chat.

I'd also check whether the LLM actually read/accessed any of the blogs in the study or just made it up!

A v2 of your methodology would be to instruct a desktop agent to visit the websites and perform the assessment.

Thanks again for sharing and that's food for thought!

12 hours later (edited)

Hi Ian,

Thank you very much for your thoughtful comments and for taking the time to replicate the experiment yourself.

Your observations are very accurate and they actually match my later experiences when I continued testing how different AI systems behave in similar scenarios. I ran additional experiments using the paid Gemini model in several modes (Fast, Thinking, and Deep Search), and the results confirmed many of the limitations you pointed out.

However, in the article I deliberately focused on the simplest possible model: the free AI integrated directly into Google Search.

This choice was intentional. Most internet users (roughly 95%) rely on answers generated directly inside the search engine interface and never use advanced paid language models for deeper analysis. Because of that, my goal was not to obtain the most objective or technically rigorous ranking possible, but rather to observe what kind of answer a typical user actually receives today.

In other words, I wanted to understand how the “default AI layer” of the web currently interprets signals of first-hand travel experience — including all its weaknesses and constraints.

I also hoped that publishing the methodology would encourage others to repeat the experiment across different models, browsers, prompts, and geographic locations, because these factors clearly influence the results. Your comparison using Claude is exactly the kind of independent verification I was hoping to see, so thank you for doing that.
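
For anyone repeating the comparison, here is a minimal Python sketch of one way to put rough numbers on how much two models agree. It is not part of the published methodology: the function names and the placeholder blog names are hypothetical, and it assumes each model returns an ordered list with no duplicates. It measures agreement between models only, not which model is more accurate.

```python
from itertools import combinations


def jaccard_overlap(a, b):
    """Fraction of all listed blogs that appear in both rankings (0 to 1)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


def kendall_tau(a, b):
    """Kendall rank correlation over the blogs both rankings share (-1 to 1).

    Returns None when fewer than two blogs are shared, because order
    agreement is undefined in that case.
    """
    in_b = set(b)
    shared = [x for x in a if x in in_b]  # shared blogs, in the order of list a
    if len(shared) < 2:
        return None
    pos_b = {x: i for i, x in enumerate(b)}
    concordant = discordant = 0
    for x, y in combinations(shared, 2):
        # x comes before y in list a; check whether list b agrees
        if pos_b[x] < pos_b[y]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


# Placeholder names, not real results from any model.
ranking_model_a = ["blog-a", "blog-b", "blog-c", "blog-d"]
ranking_model_b = ["blog-b", "blog-a", "blog-c", "blog-e"]

overlap = jaccard_overlap(ranking_model_a, ranking_model_b)
tau = kendall_tau(ranking_model_a, ranking_model_b)
print(f"overlap={overlap:.2f}, tau={tau:.2f}")
```

A high overlap with a low tau would mean the models name the same blogs but order them differently, which matches what the Claude comparison above found.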

From my additional testing it appears that even advanced paid models often do not directly analyse the full content of the evaluated websites in real time. Instead, they rely heavily on training data, previously indexed material, and already existing signals available across the web. In many cases they synthesise patterns rather than performing a full live evaluation of each source.

Interestingly, even when explicitly instructed to read and analyse a complete website, models frequently respond based on partial representations rather than a comprehensive review of the actual page content. This suggests that what we are observing is not a traditional audit process, but a probabilistic reconstruction based on prior knowledge signals.

For this reason I fully agree with your comment regarding visual authenticity scoring. Without an agent capable of actively browsing websites and analysing images directly, such evaluation cannot be treated as literal verification.

At the same time, I think this limitation is part of what makes the experiment interesting. The goal was not to measure objective authenticity itself, but to observe how current public AI systems interpret authenticity signals when asked this type of question.

And since these systems increasingly function as a first layer of discovery for readers, understanding their behaviour may already be relevant for travel publishers today.

Finally, regarding the computational aspect — I suspect another practical constraint is efficiency. A truly deep, page-by-page analytical assessment across multiple travel blogs would require significantly more processing time and resources than what real-time conversational AI systems are currently designed to provide at scale. Most users expect answers instantly, not after twenty minutes of structured crawling and evaluation, so models optimise for speed and plausibility rather than exhaustive verification.

Thanks again for your valuable feedback. It definitely helps refine the direction for a possible second version of the methodology.

7 hours later
Founder, Travel Massive

I suspect that Google is experimenting with how much AI compute they give users in this new feature. Answering complicated queries with an LLM can't possibly be as profitable as their traditional search engine.

When the venture capital runs out we are all going to pay for AI. I'm not sure how search will look when that happens!

1 hour later (edited)
Founder, Travel Massive

This video explains a bit about what's happening with AI models right now and, to your point, why Google doesn't need to charge.

16 hours later

Absolutely great 👍

9 hours later
Travel Entrepreneur & Founder, Wander Woofs

This is very interesting, and it is a struggle I sometimes have when writing blog posts about my pet-friendly travel company. I use AI to help organize the structure of the post and the information that I have to share (the practical side, which is very important), but I always make sure to include personal experiences. That way the reader can feel that we're fellow dog owners sharing our experiences of traveling with our dogs, and also that I am an experienced person in the pet-friendly travel world who can help them.

Finding the balance so it doesn't look like a fully AI-written post is not always easy!


"I asked AI to rank popular travel blogs for their authenticity, here's what I learned" was posted by Jacek in Discussion, AI, Blogging, Research. Featured on Apr 30, 2026 (14 days ago). This post is not rated yet.
