Source URL: https://cloud.google.com/blog/topics/retail/ai-for-retailers-boost-roi-without-straining-budget-or-resources/
Source: Cloud Blog
Title: How inference at the edge unlocks new AI use cases for retailers
Feedly Summary: For retailers, making intelligent, data-driven decisions in real time isn’t an advantage — it’s a necessity. Staying ahead of the curve means embracing AI, but many retailers hesitate to adopt because it’s costly to overhaul their technology. While traditional AI implementations may require significant upfront investments, retailers can leverage existing assets to harness the power of AI.
These assets, ranging from security cameras to point-of-sale systems, can unlock store analytics, faster transactions, staff enablement, loss prevention, and personalization — all without straining the budget. In this post, we’ll explore how inference at the edge, a technique that runs AI-optimized applications on local devices without relying on distant cloud servers, can transform retail assets into powerful tools.
How retailers can build an AI foundation
Retailers can find assets to fuel their AI in all corners of the business. You can unlock employee productivity by transforming your vast repository of handbooks, training materials, and operational procedures into working assets for AI.
Digitized manuals for store equipment, human resources, loss prevention, and other domain-specific information can also be combined with agent-based AI assistants to provide contextually aware “next action” assistants. With AI-optimized applications extended from the cloud to the edge, a retail associate can now ask the AI assistant, “What do I do next?” and receive a fast, detailed response tailored to their question.
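To make the pattern concrete, here is a minimal sketch of the retrieval step behind such an assistant: embed the digitized manual passages once, then pull the passages closest to the associate’s question to ground a locally served model. This is an illustration only, not Google’s implementation; the toy hashed bag-of-words embedding and the sample passages are stand-ins for a real edge embedding model and a real document corpus.

```python
import numpy as np

# Toy embedding: hashed bag-of-words. A real edge deployment would use a
# proper embedding model; this stand-in just keeps the sketch runnable.
def embed(text: str, dim: int = 64) -> np.ndarray:
    v = np.zeros(dim)
    for word in text.lower().split():
        v[hash(word) % dim] += 1.0
    return v / (np.linalg.norm(v) or 1.0)

# Digitized manual passages (hypothetical examples).
passages = [
    "To restart the POS terminal, hold the power button for ten seconds.",
    "For a suspected theft, notify the manager and file a loss report.",
    "New hires must complete safety training before operating the baler.",
]
index = np.stack([embed(p) for p in passages])

def next_action_context(question: str, k: int = 2) -> str:
    """Retrieve the k most relevant passages to ground an AI assistant."""
    scores = index @ embed(question)
    top = np.argsort(scores)[-k:][::-1]
    return "\n".join(passages[i] for i in top)

# The retrieved context would be passed to a locally served LLM as grounding.
print(next_action_context("What do I do next? The register froze."))
```

In practice, the retrieved context would be appended to the prompt of a model served on the in-store device, keeping both the documents and the inference local to the edge.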
Edge processing power decision point: CPU vs GPU
Next, we’ll explore a critical decision: choosing the right hardware to power your applications. The two primary options are CPUs (central processing units) and GPUs (graphics processing units), each with its own strengths and weaknesses. Making an informed choice requires understanding your specific use cases and balancing performance requirements, bandwidth, and model processing against cost. Use the decision matrix below to guide the process, especially when choosing between deploying at a regional data center (DC) or at the edge.
| Feature | CPU | GPU | Use cases (examples) |
| --- | --- | --- | --- |
| Cost | Lower | Higher | Basic analytics, people counting, simple object detection |
| Performance | Required; good for general-purpose tasks | Optional; good for parallel processing | Complex AI, video analytics, high-resolution image processing, ML model training |
| Power consumption | Lower | Higher | Remote locations, small form-factor devices |
| Latency | Moderate | Lower (for parallel tasks) | Real-time applications, immediate insights |
| Deployment location | Edge or regional DC | Typically edge, but feasible in regional DC | Determined by latency, bandwidth, and data processing needs |
Key decision criteria for retail decision makers
Complexity of AI models: Simpler, retail-focused AI models, like basic object detection, can often run efficiently on CPUs. More complex models, such as those used for real-time video analytics or personalized recommendations over large datasets, typically require the parallel processing power of GPUs.
Data volume and velocity: If you’re processing large amounts of data at high speed, a GPU may be necessary to keep up with the demand. For smaller datasets and lower throughput, a CPU may suffice.
Latency requirements: For use cases requiring ultra-low latency, such as real-time fraud detection, GPUs can provide faster processing, especially when located at the edge, closer to the data source. However, network latency between the edge and a regional DC might negate this benefit if the GPU is located regionally.
Budget: GPUs usually have a higher price tag than CPUs. Carefully consider your budget and the potential ROI of investing in GPU-powered solutions before making a decision. Start with CPU-based solutions where possible and upgrade to GPUs only when absolutely necessary.
Power consumption: GPUs generally consume more power than CPUs. This is an important factor to consider for edge deployments, especially in locations with limited power availability. This is less of a concern if deploying at a regional DC where power and cooling are centralized.
Deployment location: The proximity of the processing power to the data source has major implications for latency. Deploying at the edge (in-store) minimizes latency for real-time use cases. Regional DCs introduce network latency, making them less suitable for applications requiring immediate action. However, certain tasks requiring heavy compute but not low latency (e.g., nightly inventory analysis) might be better suited for a regional DC where resources can be pooled and managed centrally.
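Taken together, these criteria can be encoded as a rough first-pass heuristic. The sketch below is illustrative only: the thresholds and the `Workload` fields are hypothetical, and no heuristic replaces benchmarking the actual model on the actual hardware.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    complex_model: bool       # e.g., real-time video analytics vs. people counting
    high_throughput: bool     # large data volume at high velocity
    latency_ms_budget: float  # end-to-end latency requirement
    power_constrained: bool   # small form factor / limited in-store power

def recommend_hardware(w: Workload) -> str:
    """First-pass CPU/GPU and placement heuristic (illustrative thresholds)."""
    needs_gpu = w.complex_model or w.high_throughput
    if needs_gpu and w.power_constrained:
        return "GPU at regional DC, if the latency budget tolerates the network hop"
    if needs_gpu:
        return "GPU at the edge"
    if w.latency_ms_budget < 100:  # hypothetical real-time threshold
        return "CPU at the edge"
    return "CPU at edge or regional DC; decide on bandwidth and cost"

print(recommend_hardware(Workload(False, False, 50, True)))   # CPU at the edge
print(recommend_hardware(Workload(True, True, 200, False)))   # GPU at the edge
```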
Remember, not all AI and ML require new investments in emerging technology. Many AI/ML use cases can produce the desired outcome without a GPU. For example, consider the visual inspection for store analytics and fast checkout referenced in the Google Distributed Cloud Price-a-Tray interactive game. Inference is performed at 5 FPS while the video stream continues to run at 25 FPS; the bounding boxes are then drawn on top of the returned detections rather than having a single pipeline handle video decoding, detection, and rendering together. This enables more efficient use of the CPU, since the work in this example can be split across cores and threads.
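A minimal sketch of that decoupling, assuming an OpenCV-readable 25 FPS stream and a hypothetical `detect_objects` stand-in for the real CPU model: detection runs on every fifth frame (5 FPS), while the most recent boxes are redrawn on every frame so playback stays smooth.

```python
import cv2

def detect_objects(frame):
    """Hypothetical detector stub; swap in your real CPU model here."""
    return []  # list of (x, y, w, h) boxes

cap = cv2.VideoCapture(0)   # assumed 25 FPS camera stream
DETECT_EVERY = 5            # 25 FPS / 5 = 5 FPS inference rate
last_boxes = []
frame_idx = 0

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Run the (expensive) detector on every fifth frame only.
    if frame_idx % DETECT_EVERY == 0:
        last_boxes = detect_objects(frame)
    # Draw the most recent boxes on every frame, keeping playback at 25 FPS.
    for (x, y, w, h) in last_boxes:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("store-analytics", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
    frame_idx += 1

cap.release()
cv2.destroyAllWindows()
```

In production, the detector would typically run on its own thread or process so a slow inference never stalls rendering, which is what lets the work spread across cores.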
But there are cases where GPUs do make sense. When very high precision is required, GPUs are often needed, as the fidelity lost in quantizing a model may reduce quality beyond acceptable thresholds. In the example of tracking an item, if millimeter-level movement accuracy is required, 5 FPS would not be sufficient for a reasonably fast-moving item, and a GPU would likely be required.
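The fidelity cost of quantization is straightforward to measure directly. The sketch below uses PyTorch post-training dynamic quantization on a toy model (hypothetical; other frameworks offer similar tooling) to compare INT8 outputs against the FP32 original before deciding whether a CPU-friendly quantized model stays within your accuracy threshold.

```python
import torch
import torch.nn as nn

# A toy FP32 model standing in for a real detector head (hypothetical).
model_fp32 = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 4))
model_fp32.eval()

# Post-training dynamic quantization: weights stored as INT8, activations
# quantized on the fly. Smaller and faster on CPU, at some fidelity cost.
model_int8 = torch.ao.quantization.quantize_dynamic(
    model_fp32, {nn.Linear}, dtype=torch.qint8
)

# Measure how far the quantized outputs drift from the FP32 original.
x = torch.randn(1, 256)
with torch.no_grad():
    drift = (model_fp32(x) - model_int8(x)).abs().max().item()
print(f"max output drift after quantization: {drift:.6f}")
```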
There is a middle ground between GPUs and CPUs: the world of specialty accelerators. Accelerators come either as peripherals attached to a system or as special instruction sets built into a CPU. CPUs are now manufactured with advanced matrix-multiplication instructions that assist tensor manipulation on-chip, greatly improving the performance of ML and AI models. One concrete example is running models compiled for OpenVINO. In addition, Google Distributed Cloud (GDC) Server and Rack editions utilize Intel Core processors, an architecture designed to be more flexible, with matrix-math support that improves the performance of ML models on CPU over traditional ML model serving.
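As a minimal sketch of that OpenVINO path, assuming a model already converted to OpenVINO IR at a hypothetical path `model.xml` with static input shapes: compiling for the CPU device lets the runtime select the best matrix instructions available on the host automatically.

```python
import numpy as np
import openvino as ov

core = ov.Core()
# Load a model previously converted to OpenVINO IR (hypothetical path).
model = core.read_model("model.xml")

# Compile for CPU; OpenVINO picks the best available instruction sets
# (e.g., AVX-512 / AMX matrix extensions) on the host automatically.
compiled = core.compile_model(model, device_name="CPU")

# Run a single inference with dummy input matching the model's input shape.
input_shape = compiled.input(0).shape
dummy = np.random.rand(*input_shape).astype(np.float32)
result = compiled([dummy])[compiled.output(0)]
print("output shape:", result.shape)
```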
Bring AI to your business
By tapping into the power of existing infrastructure and deploying AI at the edge, retailers can deliver modern customer experiences, streamline operations, and unlock employee productivity.
Learn more about how to transform your retail brand with Google Distributed Cloud.
AI Summary and Description: Yes
Summary: The text discusses how retailers can leverage existing assets, including edge computing and AI technologies, to optimize operations, employee productivity, and customer experiences without substantial new investments.
Detailed Description: The content explores several major points regarding the use of AI in retail, emphasizing the following insights:
– **Edge Computing**: It highlights the importance of inference at the edge, allowing AI applications to run on local devices. This reduces the need for constant connectivity to distant cloud servers and minimizes latency in decision-making processes.
– **Existing Assets**: Retailers can utilize current technologies such as security cameras and point-of-sale systems to gather analytics, enhance loss prevention, and personalize customer experiences, thereby maximizing their existing investments.
– **AI-Optimized Applications**: The article discusses the implementation of AI assistant tools that can help employees perform their tasks more efficiently by providing real-time, context-aware responses tailored to specific queries.
– **Hardware Choices**: It outlines the decision-making process involved in selecting the right hardware (CPU vs. GPU) for different AI applications, stressing the significance of understanding use cases, cost implications, and performance requirements.
– **Key Decision Criteria**: Emphasizes important factors for retail decision-makers:
  – **Complexity of AI Models**: Identifying whether a CPU is adequate for simpler tasks or if a GPU is necessary for complex processing.
  – **Data Volume and Velocity**: Recognizing when high-speed data processing requires GPU support.
  – **Latency Requirements**: Considering deployment location’s impact on application performance.
  – **Budget Considerations**: Weighing the cost of GPU investments against expected return on investment.
  – **Power Consumption**: Acknowledging the energy demands of different hardware solutions, especially in edge deployments.
– **Specialty Accelerators**: Introduces advanced processor capabilities designed for machine learning, which improve efficiency outside of traditional CPU/GPU frameworks.
– **Call to Action for Retailers**: Encourages retailers to adopt AI and leverage their existing infrastructure to enhance customer interactions and streamline operations.
By focusing on the emerging trends in AI implementation and hardware optimization, this content is particularly relevant to professionals in the realms of AI, cloud computing, and infrastructure security, indicating practical applications and avenues for improvement within the retail sector.