10 comments

  • sowbug 14 minutes ago
    Are these kinds of libraries a temporary phenomenon? It strikes me as weird that providers haven't settled on a single API by now. Of course they aren't interested in making it easier for customers to switch away from them, but if a proprietary API was a critical part of your business plan, you probably weren't going to make it anyway.

    (I'm asking only about the compatibility layer; the other tracking features would be useful even if there were only one cloud LLM API.)

  • mosselman 19 minutes ago
    Does this have a unified API? In playing around with some of these, including unified libraries for working with various providers, I've found that you are, at some point, still forced to do provider-specific work for things such as setting temperatures, setting reasoning effort, setting tool choice modes, etc.

    What I'd like is for a proxy or library to provide a truly unified API where it will really let me integrate once and then never have to bother with provider quirks myself.

    Also, are you planning on doing an open-source rug pull like so many projects out there, including litellm?
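
    To make the "provider quirks" point concrete, here is a minimal sketch of the kind of translation layer a unified API has to hide. It is not GoModel's or litellm's code. The function name `to_provider_params` and the unified mode names are hypothetical, and the provider-side shapes (temperature ranges, the spelling of the "force a tool call" mode) are assumptions based on my reading of the public OpenAI Chat Completions and Anthropic Messages docs, so verify them before relying on this.

```python
def clamp(value, low, high):
    """Keep value inside [low, high]."""
    return min(max(value, low), high)


def to_provider_params(provider, temperature, tool_choice):
    """Translate unified settings into a provider-shaped dict (illustrative only).

    tool_choice is a hypothetical unified mode: "auto", "none", or "required".
    """
    if tool_choice not in ("auto", "none", "required"):
        raise ValueError(f"unknown tool_choice: {tool_choice}")

    if provider == "openai":
        # Assumption: temperature is accepted in [0, 2] and the mode is a plain string.
        return {
            "temperature": clamp(temperature, 0.0, 2.0),
            "tool_choice": tool_choice,
        }

    if provider == "anthropic":
        # Assumption: temperature is accepted in [0, 1], the mode is an object,
        # and the forcing mode is spelled "any" rather than "required".
        mode = "any" if tool_choice == "required" else tool_choice
        return {
            "temperature": clamp(temperature, 0.0, 1.0),
            "tool_choice": {"type": mode},
        }

    raise ValueError(f"unknown provider: {provider}")
```

    Even in this tiny example the same request needs a different range and a different shape per provider, which is the work the comment would like a proxy to absorb.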

  • driese 29 minutes ago
    Nice one! Let's say I'm serving local models via vllm (because ollama comes with huge performance hits), how would I implement that in gomodel?
    • devmor 15 minutes ago
      This is way more interesting to me as well. I have projects that use small limited-purpose language models that run on local network servers and something like this project would be a lot simpler than manually configuring API clients for each model in each project.
  • pjmlp 2 hours ago
    To be expected, given that LiteLLM seems to be implemented in Python.

    However, kudos for the project; we need more alternatives in compiled languages.

    • goodkiwi 36 minutes ago
      It’s also badly implemented: everything is a global import. Had to stop using it.
    • santiago-pl 1 hour ago
      Agreed, and thank you! Please let us know if you'd like to give it a try and whether you find any features missing in GoModel.
  • Talderigi 2 hours ago
    Curious how the semantic caching layer works. Are you embedding requests on the gateway side and doing a vector similarity lookup before proxying? And if so, how do you handle cache invalidation when the underlying model changes or gets updated?
    • giorgi_pro 1 hour ago
      Hey, contributor here. That's right, GoModel embeds requests and does vector similarity lookup before proxying. Regarding the cache invalidation, there is no "purging" involved – the model is part of the namespace (params_hash includes the LLM model, path, guardrails hash, etc). TTL takes care of the cleanup later.
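
      The scheme described above (embed the request, look up by vector similarity inside a namespace keyed by a params hash, let TTL expire old entries) can be sketched roughly as follows. This is not GoModel's code: `SemanticCache`, `toy_embed`, the threshold, and the exact fields fed into `params_hash` are illustrative assumptions, and the bag-of-words "embedding" only stands in for a real embedding model.

```python
import hashlib
import math
import time


def params_hash(model, path, guardrails_hash):
    """Namespace key: a change to any part (e.g. the model) yields a new namespace."""
    key = "\x00".join([model, path, guardrails_hash])
    return hashlib.sha256(key.encode()).hexdigest()


def toy_embed(text, dims=64):
    """Stand-in embedding (assumption): hashed bag of words, not a real model."""
    vec = [0.0] * dims
    for word in text.lower().split():
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % dims
        vec[idx] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    def __init__(self, ttl_seconds=3600.0, threshold=0.9, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.threshold = threshold
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self.entries = {}   # namespace -> list of (vector, response, expires_at)

    def put(self, namespace, prompt, response):
        entry = (toy_embed(prompt), response, self.clock() + self.ttl)
        self.entries.setdefault(namespace, []).append(entry)

    def get(self, namespace, prompt):
        now = self.clock()
        # No explicit purge: expired entries are dropped lazily on lookup.
        live = [e for e in self.entries.get(namespace, []) if e[2] > now]
        self.entries[namespace] = live
        query = toy_embed(prompt)
        best = max(live, key=lambda e: cosine(query, e[0]), default=None)
        if best is not None and cosine(query, best[0]) >= self.threshold:
            return best[1]
        return None
```

      Because the model name is part of the namespace, a request for a different model never sees entries cached for the old one; the stale entries simply age out under the TTL.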
  • indigodaddy 1 hour ago
    Any plans for AI provider subscription compatibility? E.g. ChatGPT, GH Copilot, etc.? (À la opencode)
    • santiago-pl 49 minutes ago
      You are not the first person who has asked about it.

      It looks like a useful feature to have. Therefore, I'll dig into this topic more broadly over the next few days and let you know here whether, and possibly when, we plan to add it.

  • rvz 1 hour ago
    I don't see any significant advantage over mature routers like Bifrost.

    Are there even any benchmarks?

  • anilgulecha 2 hours ago
    How does this compare to Bifrost, another Golang router?
    • santiago-pl 1 hour ago
      First of all, GoModel doesn't have a separate private repository behind a paywall/license.

      It's more lightweight and simpler. The Bifrost docker image looks 4x larger, at least for now.

      IMO GoModel is more convenient for debugging and for seeing how your request flows through different layers of AI Gateways in the Audit Logs.

      • anilgulecha 1 hour ago
        That would be valuable if there's a commitment never to have a non-open-source offering under GoModel. If so, you could document it in the repo.
  • tahosin 1 hour ago
    This is really useful. I've been building an AI platform (HOCKS AI) where I route different tasks to different providers — free OpenRouter models for chat/code gen, Gemini for vision tasks. The biggest pain point has been exactly what you describe: switching models without changing app code.

    One thing I'd love to see is built-in cost tracking per model/route. When you're mixing free and paid models, knowing exactly where your spend goes is critical. Do you have plans for that in the dashboard?

    • santiago-pl 1 hour ago
      This comment looks AI-generated.

      However, IIUC, what you're asking for is already in the dashboard! Check the Usage page.