Polimake

Google Analytics: from the Urchin acquisition in 2005 to the GA4 migration of 2023, and how to set it up well without overcomplicating it

Google Analytics explained with its real history: Urchin Software founded in 1995 and acquired by Google in March 2005, the launch of Google Analytics on November 14, 2005, the arrival of Universal Analytics in 2013, GA4 in October 2020, and the forced transition of July 1, 2023, when UA stopped processing data. How to set it up well and the most common mistakes.


The team behind Polimake. We explore the intersection of technology, creativity, and automation.


Google Analytics is Google's tool for measuring traffic, user behavior, and conversions on websites and apps. It is probably the most widely used web analytics platform in the world and also one of the most confusing for marketing professionals, because of the sheer number of technical changes it has undergone in the last five years. Anyone who learned Google Analytics before 2020 is now using what is practically a different tool; understanding why requires knowing the history.

This guide covers the origin of the tool, the fundamental changes that have transformed it, what it actually measures in its current version (GA4), and how to set it up well without falling into the mistakes that have left so many companies with useless dashboards.

The origin: Urchin Software, 1995-2005

The history of Google Analytics begins outside Google. Urchin Software Corporation was founded in San Diego, California, in 1995 by Brett Crosby, Paul Muret, and others. The company developed Urchin On Demand, a web log analysis solution that by 2003-2004 had grown into one of the serious options on the market, especially for mid-sized sites that needed something more than basic statistics but couldn't afford the cost of Omniture (then the leader in enterprise analytics).

Urchin's code and methodology were based on the analysis of web server log files. When a user visited a page, the server logged the request, and Urchin processed those log files periodically to build reports. It was a different model from the one that would later become popular with JavaScript tags, but it had advantages (it didn't lose data to blockers, and it captured identifiable bots and crawlers).

In March 2005, Google acquired Urchin Software for an amount not publicly disclosed, though it was estimated at approximately $30 million. Google's motivation was clear: integrate web analytics capabilities with its advertising services (AdWords, AdSense), allowing advertisers to measure the performance of their campaigns in a unified way.

Google Analytics: November 2005

Barely eight months after the acquisition, on November 14, 2005, Google launched Google Analytics as a free product to the public. The offering was radical: a professional web analytics tool, free of charge, accessible to any site. For companies that were paying thousands or tens of thousands of dollars a year for Omniture or others, the launch of free GA was a seismic event in the industry.

Initial demand was so high that Google had to suspend new sign-ups about a week after launch because its servers couldn't process the data. Access then moved to invitation codes released as capacity was added, and sign-ups did not fully reopen to everyone until August 2006. It is one of the few documented cases of a free tool being a victim of its own success at launch.

The first versions (2005-2008) still used code inherited from Urchin. The gradual transition toward Google-native infrastructure and the introduction of new features marked the following years: AdWords integration (2006), data sampling for large sites, advanced segmentation (2008-2010), Real-Time reporting (2011), Multi-Channel Funnels (2011).

Urchin as a standalone product continued to be sold until Google officially discontinued it in 2012, a decision that angered many enterprise customers who valued its on-premise model with no data sent to Google.

Universal Analytics: 2013

In October 2012, Google announced Universal Analytics; it entered public beta in March 2013 and left beta in April 2014. Universal Analytics represented a significant rewrite with important changes:

Cross-device tracking. For the first time you could track the same user across devices if they authenticated (via User ID).

Custom dimensions and metrics. The ability to inject custom data (product category, customer type, order value) beyond the predefined dimensions.

Server-side tracking via Measurement Protocol. The possibility of sending data to GA from outside the browser, useful for tracking physical POS systems, mobile apps, IoT.

Revised session model. More sophisticated session and user definitions, with configurable timeouts.

UA was the de facto standard in web analytics for ten years. The number of tutorials, certifications, books, courses, and best practices that developed around UA is enormous. For many marketing professionals, "Google Analytics" still means UA specifically.

GA4: October 2020 and the forced transition of July 2023

In October 2020, Google launched Google Analytics 4 (initially called App + Web during its beta phase from 2019). GA4 represented a rewrite from scratch with a completely different data model.

The fundamental changes:

An event-based model, not session-based. In UA, everything was organized in a Sessions → Pageviews → Events hierarchy. In GA4, everything is an event (including the pageview, which is a type of event called page_view). This difference seems technical but has a profound impact on how reports are built and interpreted.

Web + App unified. A GA4 property can collect data from both web and mobile apps, enabling cross-platform analysis that UA did with difficulty.

Predictive modeling with machine learning. GA4 includes predictions (purchase probability, churn probability, predicted revenue) generated by Google's ML models.

More native privacy restrictions. GA4 is designed to work with fewer cookies, supports consent mode, and has finer controls over personal data.

Removal of familiar metrics. The most painful change for those coming from UA: bounce rate was absent from GA4 at launch and, when Google reintroduced it in 2022, it was redefined as the inverse of engagement rate (the share of sessions that were not engaged sessions). Others like exit rate disappeared from the standard reports (covered in detail in exit rate).

New interface. The layout, the preconfigured reports, and the exploration capabilities changed. Those who had mastered UA had to learn the new tool practically from scratch.
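To make the event-based model and the redefined bounce rate concrete, here is a minimal sketch in Python. The event rows, session identifiers, and the KEY_EVENTS set are invented for illustration. The engagement rule follows GA4's documented definition of an engaged session: it lasts longer than 10 seconds, or contains a key event, or has two or more page views. Session duration is simplified here to last timestamp minus first timestamp, which is not exactly how GA4 measures engagement time.

```python
from collections import defaultdict

# Hypothetical flat event stream: in GA4 every interaction, page views included,
# is one row with an event name, a session identifier, and a timestamp (seconds).
events = [
    {"session": "s1", "name": "page_view", "ts": 0},
    {"session": "s1", "name": "scroll", "ts": 4},
    {"session": "s2", "name": "page_view", "ts": 0},
    {"session": "s2", "name": "page_view", "ts": 7},
    {"session": "s3", "name": "page_view", "ts": 0},
    {"session": "s3", "name": "purchase", "ts": 5},
    {"session": "s4", "name": "page_view", "ts": 0},
    {"session": "s4", "name": "scroll", "ts": 30},
]
KEY_EVENTS = {"purchase"}  # assumption: the only event marked as a key event


def is_engaged(session_events):
    """Engaged session: longer than 10 s, or a key event, or 2+ page views."""
    timestamps = [e["ts"] for e in session_events]
    duration = max(timestamps) - min(timestamps)  # simplified duration
    pageviews = sum(1 for e in session_events if e["name"] == "page_view")
    has_key_event = any(e["name"] in KEY_EVENTS for e in session_events)
    return duration > 10 or has_key_event or pageviews >= 2


def engagement_and_bounce(all_events):
    """Group the flat events into sessions, then derive both rates."""
    sessions = defaultdict(list)
    for event in all_events:
        sessions[event["session"]].append(event)
    engaged = sum(1 for evs in sessions.values() if is_engaged(evs))
    engagement_rate = engaged / len(sessions)
    return engagement_rate, 1 - engagement_rate


engagement_rate, bounce_rate = engagement_and_bounce(events)
print(f"engagement rate: {engagement_rate:.0%}, bounce rate: {bounce_rate:.0%}")
```

Note that sessions are derived from the events rather than being the container for them, and that the bounce rate is simply whatever the engagement rate leaves over. That is why the number cannot be compared directly with UA's bounce rate.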

During 2020-2022, GA4 coexisted with UA. Many companies kept UA as their primary system and ran GA4 in parallel to start getting familiar. Google progressively pushed the migration with announcements and notifications.

And then, on July 1, 2023, Universal Analytics stopped processing new data for all standard properties (the free version). The 360 (paid) versions had an extension until July 2024, but they were also eventually migrated. Companies that had not configured GA4 before that date lost their ability to measure new traffic until they migrated.

The migration was not automatic. Configuring GA4 from scratch required technical decisions (which custom events to send, how to map conversions, how to configure audiences) that many sites didn't get right. The result: there are thousands of GA4 installations in production that are misconfigured, collecting incomplete or erroneous data. UA's historical data remained accessible for consultation until July 1, 2024, when Google removed access to those properties.

That forced transition from UA to GA4 is probably the most disruptive change in web analytics in the last fifteen years, and there are still companies in 2026 that haven't completed the migration well.

What GA4 actually measures

In its current form, GA4 captures five types of information worth understanding:

Acquisition. Where users come to your site from. Sources (Google, Facebook, email, direct), mediums (organic search, paid social, referral), campaigns (with specific UTMs). It's the answer to "how do they find me?"

Engagement. What they do once on the site. Pages visited, events triggered, time spent, scroll, clicks on links, downloads. It's the answer to "what do they do?"

Demographics and technology. Who they are and with what technology. Country, language, city, device, operating system, browser. It's the answer to "who are they?"

Conversions. Specific actions you've marked as goals. Purchases, sign-ups, downloads, contacts. In GA4 these are called "key events" since March 2024 (previously "conversions"). It's the answer to "what valuable actions do they complete?"

Audiences. Groups of users defined by rules (buyers, abandoners, etc.) that can be used for report segmentation and for activation in Google Ads. It's the answer to "what types of users do I have?"

By default, GA4 collects basic events (page_view, scroll, click, file_download, video_start) and captures a lot of information without configuration. But to be useful, it almost always requires additional configuration: business-specific custom events, well-marked key events, relevant parameters associated with events.
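As a rough illustration of what "a custom event with parameters" looks like, the sketch below builds the request that GA4's Measurement Protocol expects for one event, without sending anything. The measurement ID, API secret, and client ID are placeholders, and lead_source is a hypothetical custom parameter. On a website the same event would normally be sent through gtag.js or Google Tag Manager rather than from server code.

```python
import json
from urllib.parse import urlencode

# GA4 Measurement Protocol collection endpoint.
MP_ENDPOINT = "https://www.google-analytics.com/mp/collect"


def build_mp_request(measurement_id, api_secret, client_id, name, params):
    """Return (url, json_body) for a single event. Nothing is sent."""
    query = urlencode({"measurement_id": measurement_id, "api_secret": api_secret})
    url = f"{MP_ENDPOINT}?{query}"
    body = {
        "client_id": client_id,
        "events": [{"name": name, "params": params}],
    }
    return url, json.dumps(body)


url, body = build_mp_request(
    measurement_id="G-XXXXXXX",    # placeholder
    api_secret="YOUR_API_SECRET",  # placeholder
    client_id="123456.7654321",    # placeholder
    name="generate_lead",          # a GA4 recommended event name
    params={"currency": "EUR", "value": 120.0, "lead_source": "pricing_page"},
)
print(url)
print(body)
```

The point of the example is the shape: an event is a name plus a small dictionary of parameters. Deciding which names and parameters describe your business is the configuration work that the default installation does not do for you.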

How to set up GA4 well (without overcomplicating it)

For a company that wants to use GA4 productively, there's a reasonable order of configuration:

Install the code correctly. Via Google Tag Manager (recommended for flexibility) or via gtag.js directly. Verify that the installation works on all pages with the GA Debugger extension or other tools.

Configure the basic settings of the property and its data stream. Internal traffic filtering, data retention, and links to Google Ads, Search Console, and BigQuery (for companies that want exportable raw data).

Define the key events that matter to the business. Not all events are conversions. Purchases, qualified leads, sign-ups are typically key events. General page views are not.

Configure UTMs on all external campaigns. If what you see in GA4 comes from a campaign tagged with UTMs, you can attribute it well. If campaigns aren't tagged, everything appears as direct or referral with no useful information.

Create relevant audiences. Buyers, cart abandoners, returning users, key business segments. This enables more useful reports and activation in Ads.

Build your own reports or explorations. GA4's preconfigured reports are limited. Almost any serious analysis requires building specific reports.

Connect with BigQuery if the site has high volume. The ability to export data to BigQuery (free in GA4) enables analysis with SQL without the interface's limitations, and overcomes the sampling limits that apply in large explorations.

Without those steps, GA4 installed with default configuration collects data but produces reports that don't answer the business's questions.
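For the UTM step in particular, a small helper keeps tagging consistent across campaigns. This is a minimal sketch using only Python's standard library. The function name add_utms and the example URL are invented for illustration. Values are lowercased because GA4 treats "Email" and "email" as different values, which splits one channel into two rows in reports.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_utms(url, source, medium, campaign, content=None):
    """Append UTM parameters to a URL, keeping any existing query parameters."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    utms = {"utm_source": source, "utm_medium": medium, "utm_campaign": campaign}
    if content:
        utms["utm_content"] = content
    # Normalize to lowercase so the same channel is not split by casing.
    query.update({key: value.strip().lower() for key, value in utms.items()})
    return urlunsplit(parts._replace(query=urlencode(query)))


tagged = add_utms(
    "https://example.com/landing?ref=a",
    source="Newsletter",
    medium="email",
    campaign="spring_sale",
)
print(tagged)
# https://example.com/landing?ref=a&utm_source=newsletter&utm_medium=email&utm_campaign=spring_sale
```

Generating links this way, rather than typing them by hand, is one simple way to avoid the inconsistent naming that makes campaign reports hard to read.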

The real limitations of GA4 in 2026

It's important to be honest about what GA4 does well and what it doesn't:

What it does well. Basic capture of traffic, sources, simple conversions; integration with Google Ads; analysis of simple funnels; real-time reports; automatic predictive modeling; zero cost.

What it does so-so. Multi-touch attribution (the available models are limited), cohort analysis (it exists but is basic compared to dedicated tools), advanced customization capability (some things UA allowed now require more effort).

What it does poorly or doesn't do. Advanced funnel analysis with complex steps (Mixpanel, Amplitude do it better); native A/B testing (Optimize was discontinued, you need another tool); session replays and heatmaps (Hotjar, FullStory); qualitative analysis of traffic quality; robust server-side tracking (requires significant technical configuration).

For companies with serious analytical needs, GA4 is one tool among several, not a complete solution. The typical combination in 2026:

  • GA4 as the primary measurement system and connection with Google Ads.
  • Hotjar or Microsoft Clarity for session recordings and heatmaps.
  • Mixpanel or Amplitude for deep product and funnel analysis.
  • Looker Studio or Power BI for visualization and executive dashboards.
  • BigQuery + SQL for ad-hoc analysis without restrictions.
  • Plausible, Fathom, or similar as privacy-first alternatives for small sites that don't need the complexity of GA4.

Common mistakes in GA4

Not filtering internal traffic. The team browsing the site will inflate metrics if their traffic isn't excluded (by IP or by using an internal blocker). It distorts all reports.

Not marking conversions / key events. Without this, GA4 measures traffic but not results. It's like having a speedometer but not knowing where you're going.

Not using UTMs on external campaigns. Without UTMs, you can't attribute traffic to specific campaigns. It's one of the basic habits that most companies still get wrong.

Confusing UA's bounce rate with GA4's bounce rate. The definitions changed. Comparing numbers across versions produces false conclusions.

Making decisions with small data. If your site has 500 monthly visits, the metrics have high noise and small differences aren't significant. GA4 shows the numbers but doesn't warn about statistical confidence.

Ignoring sampling. In large properties, GA4 can apply sampling in complex explorations. The data shown is then an extrapolation, not a census. The interface indicates it, but many users ignore it.

Not updating the configuration. Sites change (new pages, new products, new channels). If GA4's initial configuration isn't updated, the reports age.

Reporting to management without context. "Traffic dropped 15% this month" without mentioning seasonality, the Google algorithm change, or the campaign that ended is information without context that produces wrong decisions.

Trusting platform attribution 100%. As covered in direct advertising, GA4's multi-channel attribution (and that of any platform) is an approximation. For large investment decisions you need to combine it with other sources.

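Not reviewing privacy settings. GA4 sends data to Google and processes it. To comply with GDPR and other frameworks, you need well-configured consent mode and appropriate data retention; the IP anonymization setting familiar from UA is no longer something to configure, because GA4 does not log or store IP addresses. Without consent mode and retention set correctly, there's real legal risk.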
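On the small-data mistake above: GA4 will not tell you whether a difference is noise, but a rough check is easy to run outside the tool. The sketch below is a standard two-proportion z-test using the normal approximation. It is not something GA4 computes, and the visit and conversion counts are invented for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided, normal approximation
    return z, p_value


# Hypothetical month-over-month comparison on a small site: 2% vs 3% conversion.
z_small, p_small = two_proportion_z_test(conv_a=10, n_a=500, conv_b=15, n_b=500)

# The same rates on a site with 100 times the traffic.
z_large, p_large = two_proportion_z_test(conv_a=1000, n_a=50_000, conv_b=1500, n_b=50_000)

print(f"500 visits per month:    z = {z_small:.2f}, p = {p_small:.3f}")
print(f"50,000 visits per month: z = {z_large:.2f}, p = {p_large:.2g}")
```

With 500 visits per period, a jump from 2% to 3% is well within what chance alone produces; with 50,000 visits the same jump is very unlikely to be noise. The interface shows both cases with the same confidence.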

The reality of privacy and privacy-first alternatives

Something that deserves its own paragraph: GA4 is a free Google product that is sustained economically by the value the aggregated data brings to Google. For companies with serious privacy concerns—especially in European markets with active data protection authorities—GA4 has generated controversy.

Several European authorities (Austria in January 2022, France in February 2022, Italy in June 2022, among others) issued decisions declaring that the use of Google Analytics (in its standard form, without additional measures) violated the GDPR because it transferred personal data to the United States without guarantees equivalent to European ones. After the new EU-US data transfer framework (the EU-US Data Privacy Framework) came into force in July 2023, part of that controversy eased, although conservative legal interpretations persist.

In this context, privacy-first alternatives have grown:

  • Plausible Analytics (founded in Estonia, 2018) — simple analytics, cookieless, natively GDPR-compliant.
  • Fathom Analytics (Canada, 2018) — a similar approach, data on its own infrastructure.
  • Matomo (formerly Piwik, founded 2007) — open source, self-hostable.
  • Simple Analytics, Cloudflare Web Analytics, Pirsch, and other alternatives with similar models.

For small companies or those with specific privacy concerns, these alternatives are a viable option. For large companies with deep integration into the Google Ads ecosystem, leaving GA4 is a more complex decision.

Google Analytics and creative operations

For a brand that produces content and campaigns regularly, GA4 (or any analytics tool) is the feedback system that closes the loop between creative production and result. Without that feedback, the creative team operates blind; with well-structured feedback, decisions about what to produce more of, what to optimize, and what to discard are based on real signal.

That connection is the discipline of creative operations: creative KPIs rest on data that typically comes from GA4 (among other sources), the editorial calendar is adjusted based on what the analytics reveal works, and the documented learnings about which type of content performs best influence the next generation of pieces.

At Polimake that logic lives on three surfaces: Studio integrates analytical learnings into planning, Studio reflects validated patterns in production, and Media documents which assets performed historically so that information is accessible when producing new pieces.


If you manage marketing, content, product, or any role that requires measuring digital results and you've arrived here looking for an answer about Google Analytics, the most useful thing you can take from this article is probably the combination of three ideas: GA4 is a different tool from UA (don't assume your prior knowledge transfers directly), a default installation collects data but rarely produces decisions (conscious configuration is where the value is), and GA4 is one analytics tool among several (combining it with session recordings, product tools, and independent validation produces better conclusions than relying on it alone).

To complement this, exit rate covers a specific metric that changed a lot with the GA4 migration, Google Trends covers the complementary tool for general trends, and conversion funnel covers the concept that GA4 measures but didn't invent.

Quick references