Simon Willison’s Weblog: Claude 3.5 Haiku price drops by 20%

Source URL: https://simonwillison.net/2024/Dec/5/claude-35-haiku-price-drops-by-20/#atom-everything
Source: Simon Willison’s Weblog
Title: Claude 3.5 Haiku price drops by 20%

Feedly Summary: Claude 3.5 Haiku price drops by 20%
Buried in this otherwise quite dry post about Anthropic’s ongoing partnership with AWS:

To make this model even more accessible for a wide range of use cases, we’re lowering the price of Claude 3.5 Haiku to $0.80 per million input tokens and $4 per million output tokens across all platforms.

The previous price was $1/$5. I’ve updated my LLM pricing calculator and modified yesterday’s piece comparing prices with Amazon Nova as well.
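To make the difference concrete, here is a quick back-of-the-envelope sketch of the per-workload arithmetic; the token counts are hypothetical, chosen only for illustration:

```python
# Claude 3.5 Haiku pricing, USD per million tokens (from the announcement).
OLD = {"input": 1.00, "output": 5.00}
NEW = {"input": 0.80, "output": 4.00}

def cost(prices, input_tokens, output_tokens):
    """Total USD cost for a workload at the given per-million-token prices."""
    return (input_tokens * prices["input"] + output_tokens * prices["output"]) / 1_000_000

# Hypothetical workload: 2.5M input tokens, 400K output tokens.
print(f"old: ${cost(OLD, 2_500_000, 400_000):.2f}")  # old: $4.50
print(f"new: ${cost(NEW, 2_500_000, 400_000):.2f}")  # new: $3.60
```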
Confusing matters somewhat, the article also announces a new way to access Claude 3.5 Haiku at the old price but with “up to 60% faster inference speed”:

This faster version of Claude 3.5 Haiku, powered by Trainium2, is available in the US East (Ohio) Region via cross-region inference and is offered at $1 per million input tokens and $5 per million output tokens.

Using "cross-region inference" involve sending something called an "inference profile" to the Bedrock API. I have an open issue to figure out what that means for my llm-bedrock plugin.
Tags: anthropic, claude, generative-ai, llm-pricing, aws, ai, llms

AI Summary and Description: Yes

Summary: The text discusses a significant price drop for Anthropic's Claude 3.5 Haiku language model, making it more accessible for a wide range of use cases. It also covers a faster-inference access method offered via AWS, which has implications for both cost efficiency and operational speed in AI deployments.

Detailed Description:
The announcement concerning Claude 3.5 Haiku reflects notable updates in the pricing and performance of large language models (LLMs), particularly relevant to professionals working in AI and cloud infrastructure:

* **Pricing Updates**:
– The price for Claude 3.5 Haiku has been reduced by 20%: from $1.00 to $0.80 per million input tokens, and from $5.00 to $4.00 per million output tokens.
– This reduction is positioned to make the model more accessible for various applications, indicating a competitive strategy in the LLM market.
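A one-line check confirms the headline figure, since both cuts work out to exactly 20%:

```python
# Both the input and output price cuts are exactly 20%.
for name, old, new in [("input", 1.00, 0.80), ("output", 5.00, 4.00)]:
    print(f"{name}: {(old - new) / old:.0%} reduction")  # 20% for both
```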

* **Performance Enhancements**:
– A new access method lets users run Claude 3.5 Haiku at the previous price points but with up to 60% faster inference speed.
– This faster version runs on AWS Trainium2 chips, pairing the model with purpose-built hardware to improve inference performance.
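If this faster tier is surfaced the way Bedrock's latency-optimized inference option is, selecting it would be a one-parameter change to a Converse call; the performanceConfig field and profile ID here are assumptions rather than details confirmed by the post:

```python
import boto3

# Assumption: the Trainium2-backed tier is requested via the Converse API's
# performanceConfig field, routed through US East (Ohio) per the announcement.
client = boto3.client("bedrock-runtime", region_name="us-east-2")

response = client.converse(
    modelId="us.anthropic.claude-3-5-haiku-20241022-v1:0",  # assumed profile ID
    messages=[{"role": "user", "content": [{"text": "Summarize this in one line."}]}],
    performanceConfig={"latency": "optimized"},  # assumed; "standard" is the default
)
```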

* **Technical Considerations**:
– The mention of “cross-region inference” implies added complexity: requests may be routed between AWS regions via an inference profile, so developers need to understand and manage inter-region routing and data transfer.
– The open question of what an “inference profile” means for the llm-bedrock plugin highlights the ongoing development work needed to keep tooling compatible and efficient for developers using these models.

This development in LLM pricing and capabilities is significant: it improves both the affordability and the operational efficiency of deploying AI solutions, two critical considerations for AI, cloud computing, and associated security practices.