Question 1

What is the difference between on-premise AI and Swiss cloud AI?

Accepted Answer

Hosting AI in Switzerland with a local cloud provider (e.g. Exoscale, Infomaniak) reduces geographic risk, but the provider still has technical access to your data and models. An on-premise deployment means the servers are physically on your premises or in a datacenter under your direct contract: only you have access to the data, logs and models. For companies subject to banking, medical or professional secrecy, this is often the only acceptable option.

Question 2

Are open-weights models as performant as GPT-4 or Claude?

Accepted Answer

On well-scoped business tasks (information extraction, classification, structured text generation, Q&A on a corpus), models like DeepSeek V4, Mistral Medium 3.5 or Qwen 3.6 are today neck-and-neck with proprietary models on 2026 public benchmarks. On some complex reasoning tasks or highly specialised programming, proprietary models retain a slight edge. Our role is to evaluate both options objectively on your real use case, with concrete metrics, before making a recommendation.
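As an illustration of what "concrete metrics" can mean for a classification task, here is a minimal sketch that scores two candidates on the same labelled examples using exact-match accuracy. The function `accuracy`, the two stand-in models and the tiny dataset are all hypothetical; in practice each callable would wrap a real model and the dataset would come from your own documents.

```python
from typing import Callable, Iterable, Tuple


def accuracy(model: Callable[[str], str], dataset: Iterable[Tuple[str, str]]) -> float:
    """Share of examples where the model's answer matches the expected label."""
    examples = list(dataset)
    if not examples:
        raise ValueError("dataset must contain at least one example")
    correct = sum(
        model(prompt).strip().lower() == expected.strip().lower()
        for prompt, expected in examples
    )
    return correct / len(examples)


# Stand-ins for real model calls (assumptions, for illustration only).
def local_model(prompt: str) -> str:
    return "invoice" if "invoice" in prompt.lower() else "other"


def cloud_model(prompt: str) -> str:
    return "invoice"


# Hypothetical labelled examples: (document text, expected class).
dataset = [
    ("Invoice no. 2024-118 from supplier", "invoice"),
    ("Meeting notes, 3 March", "other"),
]

if __name__ == "__main__":
    for name, model in [("local", local_model), ("cloud", cloud_model)]:
        print(f"{name}: {accuracy(model, dataset):.0%}")
```

Exact match suits classification and extraction; free-text generation would need a different metric, which this sketch does not cover.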

Question 3

What hardware is needed for a local enterprise LLM?

Accepted Answer

A quantised 7-billion-parameter model fits on a single NVIDIA A10G GPU (24 GB VRAM), with a latency of 50 to 150 ms per request. A 70-billion-parameter model requires 2 to 4 A100 GPUs (80 GB each) running in tensor parallelism. For companies without existing GPU infrastructure, we can size and source the hardware, configure it and maintain it. At high volumes, a dedicated GPU server typically pays for itself in 12 to 24 months compared with cloud API costs.
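The sizing figures above can be approximated with simple arithmetic: weight memory is roughly the parameter count times the bytes per parameter. The sketch below is a rough estimate, not a sizing tool. The 20% overhead for KV cache and activations is an assumption, and the cost figures in the break-even example are hypothetical placeholders.

```python
import math
from typing import Optional


def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_param / 8


def gpus_needed(params_billion: float, bits_per_param: int,
                gpu_vram_gb: float, overhead: float = 1.2) -> int:
    """GPUs required to hold the weights plus an assumed overhead factor."""
    total_gb = weight_memory_gb(params_billion, bits_per_param) * overhead
    return math.ceil(total_gb / gpu_vram_gb)


def break_even_months(server_cost: float, monthly_api_cost: float,
                      monthly_running_cost: float) -> Optional[float]:
    """Months until the server is paid off; None if it never pays off."""
    monthly_saving = monthly_api_cost - monthly_running_cost
    if monthly_saving <= 0:
        return None
    return server_cost / monthly_saving


if __name__ == "__main__":
    # 7B model quantised to 4 bits on a 24 GB GPU.
    print(weight_memory_gb(7, 4), "GB of weights,", gpus_needed(7, 4, 24), "GPU")
    # 70B model in 16-bit precision on 80 GB GPUs.
    print(weight_memory_gb(70, 16), "GB of weights,", gpus_needed(70, 16, 80), "GPUs")
    # Hypothetical costs: 60,000 server, 4,000/month API bill, 1,000/month running cost.
    print(break_even_months(60_000, 4_000, 1_000), "months to break even")
```

Longer contexts and larger batch sizes push the overhead factor up, which is one reason the GPU count for a 70-billion-parameter model is given as a range.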

Question 4

How do you ensure nLPD compliance with an internal AI?

Accepted Answer

The nLPD (the revised Swiss Federal Act on Data Protection, known in English as the nFADP) requires, among other things, transparency on processing, data minimisation and technical security. An on-premise deployment structurally meets the localisation and access control requirements. We help document the register of processing activities, configure access rights and audit logs, and draft the information notices required when the AI processes personal data.
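As a sketch of what an audit log entry might look like, the example below records who sent a request, for what purpose and to which model, while storing only a hash and the length of the prompt rather than its text. The field names and the choice to hash the prompt are assumptions made for illustration; the actual content of an audit log depends on your own processing register and security policy.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(user_id: str, purpose: str, prompt: str, model: str) -> dict:
    """Build one audit log entry without storing the prompt text itself."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "purpose": purpose,
        "model": model,
        # Hash and length allow later verification without keeping the content.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "prompt_chars": len(prompt),
    }


if __name__ == "__main__":
    # Hypothetical values for illustration.
    entry = audit_record("u-1042", "contract-summary", "Summarise clause 4.", "local-7b")
    print(json.dumps(entry, indent=2))
```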

Question 5

Can local models and cloud APIs be combined depending on the type of request?

Accepted Answer

Yes, and it is often the optimal architecture. Sensitive requests (customer data, confidential documents) are processed locally by a locally hosted open-weights model. Generic or less sensitive requests can go through a cloud API to leverage more powerful models. A request router that classifies each request by its content and metadata automates this dispatching. We call this architecture 'hybrid sovereign', and it is today our default recommendation for most companies.
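A minimal sketch of such a router is shown below, assuming a rule-based classifier: a request goes to the local model if its metadata marks it as confidential or if its text matches a pattern for personal data. The `Request` class, the `confidential` metadata key and the pattern list are hypothetical; a production router would likely use a trained classifier and a broader set of rules.

```python
import re
from dataclasses import dataclass, field

# Illustrative patterns only; not an exhaustive definition of sensitive data.
SENSITIVE_PATTERNS = [
    re.compile(r"\b756\.\d{4}\.\d{4}\.\d{2}\b"),               # Swiss AHV/AVS number
    re.compile(r"\bCH\d{2}(?: ?[0-9A-Z]{4}){4} ?[0-9A-Z]\b"),  # Swiss IBAN
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),                   # email address
]


@dataclass
class Request:
    text: str
    metadata: dict = field(default_factory=dict)


def route(request: Request) -> str:
    """Return "local" for sensitive requests, "cloud" otherwise."""
    # Metadata check: the caller has explicitly flagged the request.
    if request.metadata.get("confidential", False):
        return "local"
    # Content check: the text contains something that looks like personal data.
    if any(pattern.search(request.text) for pattern in SENSITIVE_PATTERNS):
        return "local"
    return "cloud"


if __name__ == "__main__":
    print(route(Request("Summarise the history of the Gotthard tunnel")))
    print(route(Request("Contact anna@example.ch about the invoice")))
    print(route(Request("Draft a reply", {"confidential": True})))
```

This sketch sends anything unflagged to the cloud. A more cautious design would invert the default and route to the local model whenever the classifier is unsure.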

Sovereign AI in Switzerland — on-premise deployments

Four pillars of sovereignty.
One shared architecture.

Data that doesn't leave

Performant open-weights models

Structural compliance

Hybrid sovereign · request router

Scoping, benchmark, sovereign deployment.
Native compliance, not contractual.

Sovereign scoping

Benchmark on your data

Deployment & governance

Our open-weights models
and the on-premise infra.

Frequently asked questions.

