JAMES SCHLAUCH · CONSULTING PRACTICE

Signal.

What I track. Frontier model benchmarks, recent releases, San Diego AI/ML community signals, and the market moves that change my recommendations. Curated weekly.

LIVE TRACKER · 11 signals tracked
BENCHMARKS

Frontier model benchmarks

What's leading the public arenas this week.

  • Frontier-tier Elo scores for the top three labs are within 12 points of each other, which is inside statistical noise. For practical buyer decisions, the tie at the top means model selection should be driven by latency, cost, and tooling fit, not arena rank.

    LMSYS · Benchmarks
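
To see why a 12-point gap reads as a tie, the standard Elo expected-score formula can be applied to it. This is a minimal sketch, assuming arena ratings sit on the conventional 400-point logistic scale; the function name is illustrative and not part of any leaderboard's tooling.

```python
def elo_win_probability(rating_gap: float) -> float:
    """Expected head-to-head win rate for the higher-rated model.

    Assumes the conventional Elo logistic curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


# A 12-point lead: the leader is expected to win only slightly more
# than half of head-to-head comparisons.
print(f"{elo_win_probability(12):.3f}")
```

Under that assumption a 12-point lead works out to roughly a 51.7% expected win rate, which is hard to distinguish from a coin flip without a very large number of votes.
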
RELEASES

Model releases

What shipped, and what it means for production.

  • The 1M-token context window for Opus 4.7 leaves beta. Practical implication for active engagements: full-codebase RAG indexes can be replaced with single-prompt context loads on the 200K+ tokens-per-prompt path. Cache hit rate becomes the cost-determining variable.

    Anthropic · Frontier
  • Anthropic's frontier reasoning model gains a 1M-token context window in beta. Practical implication: full-codebase analysis in a single prompt becomes viable for medium-sized monorepos. Expect a pricing premium relative to the 200K-window tier, which makes cache hit rate load-bearing for cost.

    Anthropic · Frontier
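
The point about cache hit rate driving cost can be made concrete with a little arithmetic. The sketch below is illustrative only: both prices are hypothetical placeholders rather than any vendor's price list, and it ignores cache-write surcharges and output tokens.

```python
def cost_per_request(
    prompt_tokens: int,
    cache_hit_rate: float,
    base_price_per_mtok: float,
    cached_price_per_mtok: float,
) -> float:
    """Blended input cost (USD) for one long-context request.

    cache_hit_rate is the fraction of prompt tokens served from cache.
    Both prices are hypothetical placeholders in USD per million tokens.
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    cached = prompt_tokens * cache_hit_rate
    uncached = prompt_tokens - cached
    total = uncached * base_price_per_mtok + cached * cached_price_per_mtok
    return total / 1_000_000


# Illustrative numbers only: an 800K-token codebase prompt at assumed prices.
for hit_rate in (0.0, 0.5, 0.9):
    usd = cost_per_request(800_000, hit_rate, base_price_per_mtok=10.0, cached_price_per_mtok=1.0)
    print(f"hit rate {hit_rate:.0%}: ${usd:.2f} per request")
```

With these assumed prices the same prompt costs $8.00 uncached, $4.40 at a 50% hit rate, and $1.52 at 90%, which is why the hit rate matters more than the headline per-token price.
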
TOOLING

Tooling & infra

Frameworks, runtimes, and platforms moving the operating cost.

  • MLX 3.0 lands unified-memory model serving for Apple Silicon, collapsing CPU/GPU transfer overhead for on-device inference. For SoCal teams running edge-AI prototypes on M-series workstations, this changes the local-development cost curve and may shift some 'GPU-required' workflows back to laptop-class hardware.

    Apple ML Research · Tooling
  • Vercel AI SDK 5 makes streaming tool calls the default pattern. For Astro/Next-based production AI surfaces, this collapses a meaningful chunk of glue code. Practical implication: prototypes ship a week earlier; production review cycles unchanged.

    Vercel · Tooling
  • Independent crawl reports llms.txt adoption above 38% among top-1000 ranked sites in technical-content categories — up from <8% at start of Q1. Generative-engine optimization is no longer a trailing-edge bet.

    Hugging Face · GEO/AEO
POLICY

Policy & governance

What regulators and standards bodies are doing.

  • Public comment period opened on proposed rules requiring disclosure of AI-generated content in commercial communications. Applies to any consumer-facing AI workflow; regulated-industry buyers should expect compliance reviews to pick this up within Q3.

    Federal Trade Commission · Governance
MARKET

Market signals

Funding, M&A, and structural moves in the practice's domain.

  • Aggregator passes through Q2 frontier-model price drops; cost-per-million-tokens for top-tier reasoning models down 18-24% versus Q1. Implication for engagements with active token spend: re-bid the next 90 days.

    OpenRouter · Market
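
A quick way to size the re-bid is to apply the 18-24% range to current spend. This is a rough sketch, assuming token volume stays flat, the full price drop is passed through, and 90 days is treated as three months; the $50K monthly figure is a hypothetical engagement, not client data.

```python
def rebid_range(
    monthly_spend_usd: float, low_drop: float = 0.18, high_drop: float = 0.24
) -> tuple[float, float]:
    """Return (smallest, largest) projected monthly spend after the price drop.

    Assumes token volume stays flat and the full drop is passed through.
    """
    return (
        monthly_spend_usd * (1 - high_drop),
        monthly_spend_usd * (1 - low_drop),
    )


current = 50_000  # hypothetical monthly token spend in USD
low, high = rebid_range(current)
print(f"Projected monthly spend: ${low:,.0f} to ${high:,.0f}")
print(f"90-day savings: ${3 * (current - high):,.0f} to ${3 * (current - low):,.0f}")
```
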
LOCAL

San Diego AI/ML calendar

What's happening in the local community.

  • May 2026 GenAI/agents meetup at the Google Cloud San Diego venue. Active CFP for practitioner talks; sponsor slots include venue + food sponsorship paths. Highest-density local builder audience in San Diego right now.

    AICamp · Community
  • Recurring SD meetup for GenAI/LLM/agent practitioners. Strong venue for practitioner-tier conversations and informal benchmarking among local builders. Open to sponsor and speaker proposals.

    AICamp · Community
  • Conference + workshops + bootcamp at the Hyatt Regency La Jolla, June 1–5. Closest analog to a pure-ML conference in San Diego this cycle. Sponsor and speaker calendars worth tracking for 2027.

    MLcon · Community

Methodology

How this tracker works.

Signals are curated weekly from the LMSYS Chatbot Arena leaderboard, the OpenRouter model rankings, the Hugging Face trending board, the Stanford AI Index, the AI Now Institute, the Federal Trade Commission's AI rulemaking docket, the local San Diego AI/ML/Computer-Vision Meetup calendar, and selected industry publications. Each entry links to the primary source.

The tracker exists for one reason: I want my buyers (VPs of Engineering, CDOs, and Chief AI Officers) to be able to read one page once a week and know what changed in their domain. Weekly is the right cadence because that is how quickly a signal here can change a recommendation I'm giving in an active engagement.

Automation roadmap: a content pipeline (see writing) will surface candidate signals from the RSS feed of San Diego AI news and frontier benchmark releases. New signals are reviewed and pushed weekly.
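
One way the candidate-surfacing step might look is a keyword filter over RSS 2.0 items, with a human review after it. This is only a sketch: the keyword list, the sample feed, and the function name are all hypothetical, since the pipeline's actual selection criteria are not described here.

```python
import xml.etree.ElementTree as ET

# Hypothetical keyword list; the real pipeline's criteria are not specified.
KEYWORDS = ("benchmark", "context window", "san diego")


def candidate_signals(rss_xml: str, keywords=KEYWORDS) -> list[dict]:
    """Return RSS 2.0 items whose title or description mentions a keyword."""
    root = ET.fromstring(rss_xml)
    candidates = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        summary = item.findtext("description", default="")
        text = f"{title} {summary}".lower()
        if any(keyword in text for keyword in keywords):
            candidates.append(
                {"title": title, "link": item.findtext("link", default="")}
            )
    return candidates


# Made-up feed used only to exercise the filter.
SAMPLE_FEED = """<rss version="2.0"><channel>
<item><title>New arena benchmark results</title><link>https://example.com/a</link></item>
<item><title>Office move announced</title><link>https://example.com/b</link></item>
</channel></rss>"""

for signal in candidate_signals(SAMPLE_FEED):
    print(signal["title"], signal["link"])
```

Only the first sample item matches, so it alone would go forward to the weekly review; the filter narrows the feed and leaves the publish decision to the reviewer.
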
