Company: Qualcomm China Job Area: Engineering Group, Engineering Group Software Engineering General Summary: Responsible for Linux system performance analysis and optimization on upstream-kernel-based products on Qualcomm platforms. You will identify performance bottlenecks, drive root-cause analysis, develop fixes/improvements,
For two decades, NVIDIA has pioneered visual computing through the invention of the GPU, the engine of modern accelerated computing. Today, this foundation powers breakthroughs across gaming, film, scientific research, autonomous machines, and robotics. NVIDIA is
AI Vision Processors For Edge Applications Our solutions make cameras smarter by extracting valuable data from high-resolution video streams. Job Description Develop a comprehensive, robust, and verifiable security solution in chip and SoC, and validate its
At Sonos we want to create the ultimate listening experience for our customers and know that it starts by listening to each other. As part of the Sonos team, you’ll collaborate with people of all styles,
Company Description At Western Digital, our vision is to power global innovation and push the boundaries of technology to make what you thought was once impossible, possible. At our core, Western Digital is a company of
Minimum qualifications: Bachelor's degree or equivalent practical experience. 8 years of experience in software development. 5 years of experience testing and launching software products. 5 years of experience working with embedded operating systems. 3 years of experience
Company Description WD is building the infrastructure behind the AI-driven data economy. As AI scales, so does data. Every interaction, every model, every system generates data that must be stored, managed, and made accessible over time. That’s
AI Vision Processors For Edge Applications Our solutions make cameras smarter by extracting valuable data from high-resolution video streams. Job Description Responsibilities Software development and troubleshooting, including issues related to ARM drivers, audio/video capture, audio/video encoding,
Company: Qualcomm China Job Area: Engineering Group, Engineering Group Software Engineering General Summary: Qualcomm is leveraging its expertise in wireless and computing technologies to drive a major technological revolution in the automotive industry. Having led
Company: Qualcomm China Job Area: Engineering Group, Engineering Group Systems Engineering General Summary: Voice AI refers to voice-activated Gen-AI, and it is revolutionizing the way people use mobile, compute, XR & IoT devices. Qualcomm is at the
Company: Qualcomm China Job Area: Engineering Group, Engineering Group Software Engineering General Summary: Responsibilities: We’re looking for a skilled senior engineer to develop Linux audio drivers on Snapdragon SoC-based products. You will be responsible for new audio feature
Company: Qualcomm China Job Area: Engineering Group, Engineering Group Software Engineering General Summary: We are looking for a software engineer to develop Linux platform drivers and systems for Qualcomm automotive IVI/ADAS products. Responsibilities: Bring up Linux and Android GVM
Company: Qualcomm China Job Area: Engineering Group, Engineering Group Software Engineering General Summary: Company Overview: Qualcomm is a company of inventors that unlocked 5G, ushering in an age of rapid acceleration in connectivity and new possibilities that
Who We Are Zenni pioneered the online eyewear industry in 2003 with a mission to make eyewear affordable and accessible for everyone. With complete prescription pairs starting under $10, Zenni offers adults and children the freedom
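AI Institute - Inference Infra Engineer (Quantization Algorithm Research / Inference Framework Optimization / GPU Optimization) Beijing, Shanghai Full-time Internet / Electronics / Online Games Job Description
[Track 1] Quantization Algorithm Researcher
- Job description: Use cutting-edge model quantization, compression, and inference acceleration techniques to significantly reduce the storage footprint and compute cost of large language models and multimodal models, and drive the large-scale deployment of LLMs.
- Responsibilities:
1. Develop and improve core algorithms such as PTQ (post-training quantization), QAT (quantization-aware training), and mixed-precision quantization; design customized quantization schemes for LLMs/VLMs (large language models / vision-language models); and continuously optimize the balance between model accuracy and inference efficiency.
2. Explore and apply joint compression techniques such as low-bit quantization (e.g. INT8/INT4/FP8/FP4), weight sparsification, and knowledge distillation, raising the compression ratio while keeping accuracy loss under control.
3. Develop and optimize the quantization toolchain, and complete model conversion, quantization calibration, and deployment integration for the GLM model series.
4. Track cutting-edge quantization techniques in academia and industry, and drive technical iteration through paper reproduction and experimental comparison.
- Requirements:
1. Master's degree or above in computer science, electronic engineering, mathematics, or a related field, with 3+ years of experience in model quantization or inference acceleration; or an outstanding bachelor's graduate with a solid project track record.
2. Deep understanding of the Transformer architecture and the LLM inference pipeline; expert in Python; familiar with common open-source LLM inference frameworks (sglang/vllm/trtllm, etc.).
3. Command of quantization principles (calibration strategies, quantization granularity, error analysis) and mainstream algorithms (e.g. GPTQ, AWQ).
4. CUDA/Triton programming experience, with the ability to independently implement high-performance operators or optimize kernel computation, is a plus.
[Track 2] Inference Framework Optimization Engineer
- Job description:
1. High-performance operator development and optimization: design, develop, and aggressively optimize the core GPU operators (kernels) of AI models (especially large language models and multimodal models), supporting efficient training and inference.
2. Performance analysis and tuning: analyze the performance bottlenecks of GPU applications in depth, and significantly raise the throughput and lower the latency of compute-intensive tasks by optimizing memory access patterns, thread scheduling, execution efficiency, and similar factors.
3. Technology integration and application: research and apply cutting-edge industry optimization techniques (such as model quantization with QAT/PTQ, operator fusion, dynamic shape support, and FlashAttention), and integrate them into the inference/training engine.
- Requirements:
1. Programming ability: 3+ years of experience in GPU programming and high-performance computing optimization; deep understanding of GPU architecture, parallel computing principles, and computer architecture; experience developing and optimizing high-performance compute kernels.
2. Expert in C/C++, with a solid programming foundation, good coding style, and extensive debugging experience; proficient in Python; familiar with the Linux development environment.
3. Performance optimization experience: proficient with GPU profiling tools such as Nsight Compute and Nsight Systems; real performance optimization cases and results; able to independently locate and solve complex performance problems.
4. Algorithm fundamentals: familiar with the algorithms behind math libraries, including basic mathematical functions, linear algebra, matrix operations, and numerical computation; understands how common deep learning operators are computed.
[Track 3] GPU Optimization Engineer
- Job description: Use knowledge of the CUDA software ecosystem and the underlying hardware architecture to help the team optimize the compute efficiency of training and inference.
- Responsibilities:
1. High-performance operator development and optimization: design, develop, and aggressively optimize the core GPU operators (kernels) of AI models (especially large language models and multimodal models), supporting efficient training and inference.
2. Performance analysis and tuning: analyze the performance bottlenecks of GPU applications in depth, and significantly raise the throughput and lower the latency of compute-intensive tasks by optimizing memory access patterns, thread scheduling, execution efficiency, multi-stream parallel coordination, and similar factors.
3. Technology selection: evaluate GPU-related DSLs/compilers (for example triton/cuteDSL/tilelang), decide the team's DSL/compiler technology choices, and build up technical reserves for future iterations.
- Requirements:
1. Programming ability: 3+ years of experience in GPU programming and high-performance computing optimization; deep understanding of GPU architecture, parallel computing principles, and computer architecture; experience developing and optimizing high-performance compute kernels.
2. Expert in C/C++, with a solid programming foundation, good coding style, and extensive debugging experience; proficient in Python; familiar with the Linux development environment.
3. Performance optimization experience: proficient with GPU profiling tools such as Nsight Compute and Nsight Systems; real performance optimization cases and results; able to independently locate and solve complex performance problems.
4. Algorithm fundamentals: familiar with the algorithms behind math libraries, including basic mathematical functions, linear algebra, matrix operations, and numerical computation; understands how common deep learning operators are computed.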
At NVIDIA, we are closing the embodiment gap. We don’t just build robots; we build digital and physical nervous systems that allow humans to teach robots. You will lead the development of DexUMI (Dexterous Universal Manipulation Interface),
NVIDIA's Silicon Co-design Group (SCG) sits at a rare intersection: we own the full product development lifecycle, from early architecture definition through silicon bring-up to product release. Our ArchDev team is the hub for silicon and
NVIDIA is the world leader in GPU computing. We are passionate about markets including data centers, gaming, professional visualization, automotive, HPC, and networking. We are well positioned as the “AI Computing Company”, and our GPUs are
Company Description Do you want beneficial technologies being shaped by your ideas? Whether in the areas of mobility solutions, consumer goods, industrial technology or energy and building technology - with us, you will have the chance
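Algorithm Embedded Deployment Engineer Shanghai Full-time (regular) Smart Manufacturing / Industrial Internet / Industrial Automation Job Description
- Convert models trained in PyTorch/TensorFlow into formats supported by embedded devices (e.g. ONNX, TensorRT, TFLite).
- Integrate and tune inference engines (e.g. TensorRT, OpenVINO, TFLite, RKNN) on the target platform to achieve low-latency, high-throughput algorithm inference.
- Write high-performance C++ code, using NEON/SIMD instruction sets, multithreading, memory pools, and similar techniques for low-level optimization.
- Use tools such as Nsight Systems, VTune, and perf for end-to-end performance profiling, and pinpoint performance bottlenecks (operators, memory, I/O, etc.).
- Work with the hardware team to make full use of the compute power of heterogeneous units such as NPU/DPU, DSP, and GPU, and design efficient task scheduling and data flow.
- Build automated deployment pipelines, write deployment scripts, and run rigorous accuracy, speed, power consumption, and stability tests.
Job Requirements
Required
- Graduate degree or above (from a 985 university) in computer science, electronics, automation, or a related field.
- Expert in C/C++, familiar with modern C++ features, with high-performance programming and memory optimization skills.
- Proficient in Python for model conversion, testing, and automation scripting.
- Expert in ONNX and its toolchain; familiar with at least one inference engine (TensorRT, OpenVINO, TFLite, etc.).
- Familiar with the Linux development environment, with experience in cross-compilation, drivers, and system tuning.
- Solid foundation in computer architecture, with an understanding of CPU caches, memory management, and related principles.
Preferred
1. Hardware architecture expertise: familiar with mainstream AI chip architectures such as Jetson, HiSilicon, Horizon Robotics, and Rockchip, with BSP development experience.
2. Compiler technology: knowledge of compiler technologies such as TVM and MLIR; relevant experience is a strong advantage.
3. Robotics / autonomous driving: familiar with ROS 2, with experience deploying perception, planning, and control algorithms.
4. Computer vision: experience with aggressive optimization of CV algorithms (object detection, segmentation, classification).
Apply...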