Lenovo's AI and edge server portfolio brings GPU-accelerated AI inference to Latin American organizations without requiring cloud connectivity. The ThinkSystem SR670 V3 AI handles multi-GPU training and inference in the data center, while the ThinkEdge SE series deploys AI at industrial sites and maquilas where cloud latency is unacceptable. GLADiiUM Technology Partners delivers and manages these systems across Honduras, Panama, Costa Rica, Miami and Puerto Rico.
ThinkSystem SR670 V3 AI and ThinkEdge SE series for GPU-accelerated AI inference and training at the data center and the edge — run AI models locally without cloud dependency for latency-sensitive and data-sovereign workloads in Latin America
Not every AI workload belongs in the cloud. For Latin American organizations running AI inference on sensitive financial data that cannot leave the country, manufacturing quality control AI that requires sub-millisecond latency at the production line, or healthcare AI that processes protected patient information under HIPAA, running AI models on local infrastructure is not just preferable — it is required.
Lenovo’s AI server portfolio is purpose-built for this need: the ThinkSystem SR670 V3 for multi-GPU enterprise AI training and high-throughput inference in the data center, and the ThinkEdge SE series for AI inference at the edge — in manufacturing plants, retail locations, branch offices and any environment where cloud connectivity is unreliable, latency is critical or data sovereignty is required.
GLADiiUM Technology Partners is the authorized Lenovo infrastructure partner in Latin America, deploying AI server infrastructure for organizations in Honduras, Panama, Costa Rica, Miami and Puerto Rico that want the performance of GPU-accelerated AI without the operational complexity of managing cloud GPU instances.
Lenovo ThinkSystem SR670 V3 — Enterprise AI Server
The ThinkSystem SR670 V3 is Lenovo’s flagship enterprise AI server, designed for demanding GPU-accelerated workloads in the data center. Key specifications and capabilities:
- Dual Intel Xeon Scalable processors — 4th Gen Intel Xeon (Sapphire Rapids) with up to 60 cores per processor and PCIe 5.0 interconnect for maximum GPU bandwidth
- Up to 4 double-width or 8 single-width PCIe 5.0 GPUs — supports NVIDIA H100, A100, L40S and A30 GPUs in various configurations depending on workload requirements
- Up to 4TB DDR5 ECC memory — critical for large language model inference where model weights must fit in system memory
- NVLink bridge support — enables high-speed GPU-to-GPU communication for multi-GPU training workloads
- Hot-swap storage and redundant power — enterprise reliability for production AI inference workloads that cannot tolerate downtime
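As a rough capacity-planning aid, the GPU and memory figures above can be turned into a back-of-the-envelope sizing check: model parameter count times bytes per parameter, plus overhead for KV cache and activations. The sketch below is illustrative only, not Lenovo sizing guidance; the 20% overhead factor is an assumed rule of thumb.

```python
# Rough GPU memory estimate for serving an LLM: parameter count times
# bytes per parameter, plus ~20% assumed overhead for KV cache and
# activations. Illustrative rule of thumb, not official sizing guidance.

BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def estimated_vram_gb(params_billions: float, precision: str = "fp16",
                      overhead: float = 0.20) -> float:
    """Return an estimated GPU memory footprint in GB."""
    weights_gb = params_billions * BYTES_PER_PARAM[precision]
    return round(weights_gb * (1 + overhead), 1)

def fits_on_gpus(params_billions: float, gpu_vram_gb: int, gpu_count: int,
                 precision: str = "fp16") -> bool:
    """Check whether the model fits in the pooled memory of N GPUs."""
    return estimated_vram_gb(params_billions, precision) <= gpu_vram_gb * gpu_count

# A 70B-parameter model at fp16 needs roughly 168 GB — it fits across
# four 80 GB H100s, but not on a single card.
```

By this estimate, a 7B model quantized to int4 needs only a few GB and fits comfortably on an edge-class GPU, while 70B-class models call for a multi-GPU SR670 V3 configuration.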
SR670 V3 Use Cases for Latin American Organizations
- LLM inference on-premise — Run open-source LLMs (Llama 3, Mistral, Mixtral) locally on a single SR670 V3 node, eliminating cloud API costs and data sovereignty concerns for organizations processing sensitive data
- Computer vision AI for manufacturing — Deploy quality control vision models on a data center SR670 V3 that processes camera feeds from multiple production lines simultaneously
- AI model fine-tuning — Fine-tune foundation models on proprietary data using SR670 V3 GPU clusters, keeping sensitive training data entirely on-premise
- Financial AI inference — Run credit scoring, fraud detection and AML models locally at financial institutions where regulatory requirements prevent sending customer data to cloud AI APIs
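The on-premise LLM use case above typically works by pointing applications at a local inference server instead of a cloud API. As a minimal sketch: serving stacks such as vLLM expose an OpenAI-compatible HTTP endpoint, so a client only needs standard-library HTTP. The endpoint URL and model name below are placeholders, not a specific GLADiiUM deployment.

```python
# Minimal sketch of querying a locally hosted LLM through an
# OpenAI-compatible HTTP API (as exposed by serving stacks such as vLLM).
# Endpoint URL and model name are placeholders for illustration.
import json
import urllib.request

LOCAL_ENDPOINT = "http://localhost:8000/v1/chat/completions"  # assumed local server

def build_chat_request(prompt: str,
                       model: str = "meta-llama/Llama-3-70B-Instruct",
                       max_tokens: int = 256) -> dict:
    """Build a chat-completions payload; nothing leaves the local network."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def local_chat(prompt: str) -> str:
    """Send the prompt to the on-premise inference server, return the reply."""
    body = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(LOCAL_ENDPOINT, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
```

Because the API shape matches the cloud providers', existing applications can usually be repointed at the local server by changing only the base URL.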

Lenovo ThinkEdge SE Series — Edge AI Inference
The ThinkEdge SE series brings GPU-accelerated AI inference to environments outside the traditional data center — factory floors, retail locations, hospital wards, logistics hubs and any location where sending data to the cloud is impractical, too slow or prohibited.
ThinkEdge SE350 V2
A compact, short-depth 1U server designed for deployment in network closets, retail back offices and edge locations without dedicated server room space. Intel Xeon D processor, optional NVIDIA T4 GPU for inference, and ruggedized operating specifications (wide temperature range, vibration tolerance). Ideal for retail AI applications, branch office inference and light manufacturing edge workloads.
ThinkEdge SE455 V3
A more capable edge AI platform supporting up to 2 x NVIDIA L4 or A2 GPUs for higher-throughput inference. Designed for industrial AI applications including computer vision at production lines, predictive maintenance at manufacturing facilities, and smart logistics at distribution centers. Operates in harsh environments with extended temperature tolerance and vibration resistance.
ThinkEdge SE360 V2
An ultra-compact, fanless edge computing platform for IoT and inference workloads at the extreme edge — directly on the factory floor, in retail displays or in outdoor enclosures. Optional NVIDIA Jetson module for GPU-accelerated AI inference at ultra-low power consumption. Ideal for distributed AI inference across many locations where each node processes local data without sending it to a central server.
Complete Data Sovereignty
Run AI models entirely within your facility. No data leaves your network. Critical for financial institutions, healthcare organizations and manufacturers with sensitive IP under CNBS, HIPAA or client contractual requirements.
Low-Latency Inference
GPU-accelerated inference delivers responses in milliseconds. No network round-trip to a cloud API. Essential for production line AI, real-time fraud detection and interactive customer-facing AI applications.
Predictable Cost vs Cloud API
Eliminate recurring cloud AI API costs for high-volume inference workloads. For organizations with significant AI API consumption, on-premise AI infrastructure typically pays for itself within 12 to 24 months.
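The payback claim above reduces to simple arithmetic: months until a one-time hardware spend overtakes a recurring cloud API bill. All figures in this sketch are placeholders; plug in your own quotes and invoices.

```python
# Back-of-the-envelope break-even: months until a one-time on-premise
# hardware spend overtakes a recurring cloud AI API bill.
# All dollar figures below are illustrative placeholders.

def breakeven_months(hardware_cost: float, monthly_opex: float,
                     monthly_cloud_api_bill: float) -> float:
    """Months for cumulative cloud spend to exceed on-premise cost.

    Returns float('inf') when the cloud bill never exceeds the
    on-premise running cost (i.e., cloud stays cheaper).
    """
    monthly_savings = monthly_cloud_api_bill - monthly_opex
    if monthly_savings <= 0:
        return float("inf")
    return round(hardware_cost / monthly_savings, 1)

# Example: a $180,000 server with $2,000/month power and support,
# replacing a $12,000/month cloud API bill, breaks even in 18 months.
print(breakeven_months(180_000, 2_000, 12_000))  # 18.0
```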
Works Without Internet
ThinkEdge SE series operates in environments with intermittent or limited internet connectivity. AI inference continues locally regardless of WAN connectivity status.
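Offline operation usually follows a store-and-forward pattern: inference always runs locally, results queue up while the WAN is down, and the queue drains when connectivity returns. A minimal sketch, where the `predict` and `upload` callables stand in for a real model and a real central sync endpoint:

```python
# Sketch of edge store-and-forward: inference runs locally regardless of
# WAN status; results queue up while offline and sync when the link is
# back. The predict/upload callables are illustrative placeholders.
from collections import deque

class EdgeNode:
    def __init__(self, predict, upload):
        self.predict = predict      # local model inference (always available)
        self.upload = upload        # central sync; raises ConnectionError when down
        self.pending = deque()      # results awaiting upload

    def process(self, sample):
        """Run inference locally and queue the result for later sync."""
        result = self.predict(sample)
        self.pending.append(result)
        return result               # immediate local answer, no cloud round-trip

    def sync(self):
        """Flush queued results; stop (keeping the rest) on first failure."""
        while self.pending:
            try:
                self.upload(self.pending[0])
            except ConnectionError:
                return False        # WAN still down; retry later
            self.pending.popleft()
        return True
```

The key design point is that `process` never touches the network, so production-line decisions keep flowing even during a multi-hour WAN outage.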
Industrial and Edge Ready
ThinkEdge platforms designed for factory floors, industrial environments and commercial locations without dedicated data center conditioning.
Unified Management
Lenovo XClarity Administrator provides unified management of all ThinkSystem and ThinkEdge platforms from a single console. GLADiiUM manages ongoing operations as part of infrastructure managed services.
Frequently Asked Questions — Lenovo AI Servers Latin America
When should a Latin American organization run AI on-premise rather than in the cloud?
On-premise AI infrastructure is the right choice when: (1) data sovereignty requirements prohibit sending data to external APIs (CNBS-supervised financial institutions, healthcare organizations under HIPAA, manufacturers with client IP confidentiality obligations); (2) latency requirements are incompatible with cloud round-trip times (manufacturing quality control AI, real-time fraud detection, interactive customer AI); (3) inference volume is high enough that the operational cost of cloud AI APIs exceeds the amortized cost of on-premise GPU hardware; or (4) connectivity to cloud regions is unreliable (edge locations, rural industrial sites). For organizations that need occasional AI capabilities for moderate volumes without data sovereignty constraints, cloud APIs (Azure AI Foundry, Amazon Bedrock) remain the more economical choice.
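These four criteria can be encoded as a simple screening checklist. The sketch below is purely illustrative; the round-trip latency and spend thresholds are assumed placeholder values, and a real assessment weighs the criteria against actual quotes.

```python
# The four on-premise criteria above as a simple screening checklist.
# The latency and spend thresholds are assumed placeholder values.

def recommend_on_premise(data_sovereignty_required: bool,
                         max_latency_ms: float,
                         monthly_api_spend_usd: float,
                         reliable_cloud_connectivity: bool,
                         cloud_roundtrip_ms: float = 150.0,      # assumed typical
                         api_spend_threshold_usd: float = 8_000  # placeholder
                         ) -> bool:
    """True if any of the four criteria points to on-premise AI."""
    return (data_sovereignty_required                       # (1) sovereignty
            or max_latency_ms < cloud_roundtrip_ms          # (2) latency
            or monthly_api_spend_usd > api_spend_threshold_usd  # (3) cost
            or not reliable_cloud_connectivity)             # (4) connectivity

# A CNBS-supervised bank with sovereignty constraints -> on-premise.
print(recommend_on_premise(True, 500, 1_000, True))   # True
# Light, non-sensitive workload with good connectivity -> cloud APIs.
print(recommend_on_premise(False, 500, 1_000, True))  # False
```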
Which AI models can run on a Lenovo SR670 V3?
The SR670 V3 with NVIDIA H100 GPUs can run most open-source LLMs at production quality: Llama 3 70B, Mistral 7B and Mixtral 8x7B run comfortably at high throughput. With multiple H100 GPUs connected via NVLink, larger models such as Llama 3.1 405B become feasible, typically with quantization. For vision AI, the SR670 V3 can process video streams from multiple cameras simultaneously for quality control and security applications. GLADiiUM sizes SR670 V3 configurations based on the specific models and throughput requirements of each client.
How does GLADiiUM support Lenovo AI server deployments in Latin America?
GLADiiUM provides factory-authorized deployment and configuration of Lenovo AI servers, including GPU driver installation, CUDA environment setup, container runtime configuration and integration with AI framework environments (PyTorch, TensorFlow, TensorRT). Post-deployment, we provide Lenovo Premier Support management, hardware monitoring via XClarity and optional AI operations managed services for clients who want GLADiiUM to manage model deployment, performance monitoring and capacity planning on their AI server infrastructure.
Run AI Locally with Lenovo AI Servers
GLADiiUM will assess your AI workload requirements, evaluate on-premise vs cloud economics for your specific use case, and design a Lenovo AI server configuration sized for your inference or training needs.