flo_ai aws bedrock integration #228
`flo_ai/llm/aws_bedrock_llm.py` (new file, +285 lines):

```python
import json
import re
from typing import Dict, Any, List, AsyncIterator, Optional
import boto3
import asyncio
from .base_llm import BaseLLM
from flo_ai.models.chat_message import ImageMessageContent
from flo_ai.tool.base_tool import Tool
from flo_ai.telemetry.instrumentation import (
    trace_llm_call,
    trace_llm_stream,
    llm_metrics,
    add_span_attributes,
)
from flo_ai.telemetry import get_tracer
from flo_ai.utils.logger import logger
from opentelemetry import trace


class AWSBedrock(BaseLLM):  # Only openai compatible for now
    def __init__(
        self,
        model: str = 'openai.gpt-oss-20b-1:0',
        temperature: float = 0.7,
        **kwargs,
    ):
        super().__init__(model=model, temperature=temperature, **kwargs)
        self.boto_client = boto3.client('bedrock-runtime')
        self.model = model
        self.kwargs = kwargs

    @staticmethod
    def _strip_reasoning(text: str) -> str:
        return re.sub(r'<reasoning>.*?</reasoning>', '', text, flags=re.DOTALL).strip()

    def _convert_messages(
        self, messages: list[dict], output_schema: dict | None = None
    ) -> list[dict]:
        result = []

        if output_schema:
            result.append(
                {
                    'role': 'system',
                    'content': f'Provide output in the following JSON schema:\n{json.dumps(output_schema, indent=2)}',
                }
            )

        for msg in messages:
            if msg['role'] == 'function':
                result.append(
                    {
                        'role': 'tool',
                        'tool_call_id': msg.get('tool_use_id', 'unknown'),
                        'content': msg['content'],
                        'name': msg.get('name', ''),
                    }
                )
            else:
                result.append(msg)

        return result

    @trace_llm_call(provider='bedrock')
    async def generate(
        self,
        messages: list[dict],
        functions: Optional[List[Dict[str, Any]]] = None,
        output_schema: Optional[Dict[str, Any]] = None,
        **kwargs,
    ) -> Any:
        converted = self._convert_messages(messages, output_schema)

        request_options = {**self.kwargs, **kwargs}
        request_body: Dict[str, Any] = {
            'model': self.model,
            'messages': converted,
            'temperature': self.temperature,
        }
        if 'max_tokens' in request_options:
            request_body['max_completion_tokens'] = request_options['max_tokens']
        if functions:
            request_body['tools'] = functions

        response = await asyncio.to_thread(
            self.boto_client.invoke_model,
            modelId=self.model,
            body=json.dumps(request_body),
        )
        response_body = json.loads(response['body'].read().decode('utf-8'))
```
**coderabbitai[bot]** commented on lines +64 to +90:

This implementation is coupled to OpenAI-compatible Bedrock models, not general Bedrock support. The request body uses an OpenAI-style schema (`model`, `messages`, `temperature`, `max_completion_tokens`, `tools`) and the response parsing assumes an OpenAI-style shape (`choices[0].message`, `delta.content`). However, `InvokeModel` passes the body through to the model provider unchanged, and each native Bedrock model family expects its own format: Anthropic Claude takes the Messages API body with `anthropic_version` and `max_tokens`, while Amazon Titan Text takes `inputText` plus a `textGenerationConfig` object.

Users attempting to use native Bedrock models (e.g. `anthropic.claude-*`, `amazon.titan-*`) with this class will therefore hit request-validation errors or unparseable responses.

Recommended fixes: either document clearly (class docstring and README) that only OpenAI-compatible Bedrock models are supported, or adopt the `Converse` API, which normalizes the request and response shape across Bedrock model families. Ruff also flags an unused method argument at line 69 (ARG002).

**Member:** This looks like a valid issue; maybe you can add a comment on whether it's true.

**coderabbitai[bot]:** Confirmed: the class only works with OpenAI-compatible Bedrock models. Native models break on both the OpenAI-style request body and the OpenAI-style response parsing. Recommendation: either add a clear docstring/comment stating "This class only supports OpenAI-compatible Bedrock models (openai.*)", or rename the class to make the restriction explicit.
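The Converse API route suggested above can be sketched as a pure request-shaping helper. This is an illustrative sketch, not code from the PR: the function name `to_converse_request` is hypothetical, and it maps OpenAI-style chat messages onto the uniform request shape that boto3's `bedrock-runtime` `converse` call accepts for any model family.

```python
from typing import Any, Dict, List


def to_converse_request(
    messages: List[Dict[str, Any]], temperature: float, max_tokens: int
) -> Dict[str, Any]:
    """Map OpenAI-style chat messages to the Bedrock Converse API shape."""
    # Converse takes system prompts separately from the conversation turns.
    system = [{'text': m['content']} for m in messages if m['role'] == 'system']
    converse_messages = [
        {'role': m['role'], 'content': [{'text': m['content']}]}
        for m in messages
        if m['role'] in ('user', 'assistant')
    ]
    request: Dict[str, Any] = {
        'messages': converse_messages,
        'inferenceConfig': {'temperature': temperature, 'maxTokens': max_tokens},
    }
    if system:
        request['system'] = system
    return request


# With boto3 this would then be used roughly as:
#   client = boto3.client('bedrock-runtime')
#   client.converse(modelId='anthropic.claude-3-sonnet-20240229-v1:0',
#                   **to_converse_request(messages, 0.7, 1024))
```

Because Converse owns the provider-specific translation, the same request dict works for Claude, Titan, and the OpenAI-compatible models alike, which would remove the coupling flagged in this comment.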
```python
        usage = response_body.get('usage', {})
        if usage:
            llm_metrics.record_tokens(
                total_tokens=usage.get('total_tokens', 0),
                prompt_tokens=usage.get('prompt_tokens', 0),
                completion_tokens=usage.get('completion_tokens', 0),
                model=self.model,
                provider='bedrock',
            )
            tracer = get_tracer()
            if tracer:
                add_span_attributes(
                    trace.get_current_span(),
                    {
                        'llm.tokens.prompt': usage.get('prompt_tokens', 0),
                        'llm.tokens.completion': usage.get('completion_tokens', 0),
                        'llm.tokens.total': usage.get('total_tokens', 0),
                    },
                )

        choices = response_body.get('choices', [])
        if not choices:
            return {'content': '', 'raw_message': response_body}

        message = choices[0].get('message', {})
        if 'content' in message and message['content']:
            message['content'] = self._strip_reasoning(message['content'])
        text_content = message.get('content', '') or ''
        tool_call = None

        tool_calls = message.get('tool_calls', [])
        if tool_calls:
            tc = tool_calls[0]
            tool_call = {
                'name': tc['function']['name'],
                'arguments': tc['function']['arguments'],
                'id': tc['id'],
            }

        if tool_call:
            return {
                'content': text_content,
                'function_call': tool_call,
                'raw_message': message,
            }
        return {'content': text_content, 'raw_message': message}

    @trace_llm_stream(provider='bedrock')
    async def stream(
        self,
        messages: List[Dict[str, Any]],
        functions: Optional[List[Dict[str, Any]]] = None,
        **kwargs: Any,
    ) -> AsyncIterator[Dict[str, Any]]:
        converted = self._convert_messages(messages)

        request_options = {**self.kwargs, **kwargs}
        request_body: Dict[str, Any] = {
            'model': self.model,
            'messages': converted,
            'temperature': self.temperature,
            'stream': True,
        }
        if 'max_tokens' in request_options:
            request_body['max_completion_tokens'] = request_options['max_tokens']
        if functions:
            request_body['tools'] = functions

        response = await asyncio.to_thread(
            self.boto_client.invoke_model_with_response_stream,
            modelId=self.model,
            body=json.dumps(request_body),
        )

        queue: asyncio.Queue = asyncio.Queue()
        loop = asyncio.get_running_loop()

        def _iter_events():
            try:
                for event in response['body']:
                    chunk_bytes = event.get('chunk', {}).get('bytes', b'')
                    if chunk_bytes:
                        loop.call_soon_threadsafe(queue.put_nowait, chunk_bytes)
            except Exception as exc:
                loop.call_soon_threadsafe(queue.put_nowait, exc)
            finally:
                loop.call_soon_threadsafe(queue.put_nowait, None)  # sentinel

        loop.run_in_executor(None, _iter_events)

        while True:
            chunk_bytes = await queue.get()
            if isinstance(chunk_bytes, Exception):
                raise chunk_bytes
            if chunk_bytes is None:
                break
            text = chunk_bytes.decode('utf-8').strip()
            content = None
            try:
                data = json.loads(text)
                content = data.get('choices', [{}])[0].get('delta', {}).get('content')
            except json.JSONDecodeError:
                # Not valid JSON, try SSE format below
                for line in text.split('\n'):
                    line = line.strip()
                    if line.startswith('data: ') and line != 'data: [DONE]':
                        try:
                            data = json.loads(line[6:])
                            content = (
                                data.get('choices', [{}])[0]
                                .get('delta', {})
                                .get('content')
                            )
                            if content:
                                clean = self._strip_reasoning(content)
                                if clean:
                                    yield {'content': clean}
                                content = None
                        except json.JSONDecodeError:
                            logger.debug('Skipping malformed SSE line: %s', line)
            if content:
                clean = self._strip_reasoning(content)
                if clean:
                    yield {'content': clean}

    def get_message_content(self, response: Dict[str, Any]) -> str:
        content = (
            response.get('content', '') if isinstance(response, dict) else str(response)
        )
        return self._strip_reasoning(content)

    def format_tool_for_llm(self, tool: 'Tool') -> Dict[str, Any]:
        return {
            'type': 'function',
            'function': {
                'name': tool.name,
                'description': tool.description,
                'parameters': {
                    'type': 'object',
                    'properties': {
                        name: {
                            'type': info.get('type', 'string'),
                            'description': info.get('description', ''),
                            **(
                                {'items': info['items']}
                                if info.get('type') == 'array' and 'items' in info
                                else {}
                            ),
                        }
                        for name, info in tool.parameters.items()
                    },
                    'required': [
                        name
                        for name, info in tool.parameters.items()
                        if info.get('required', True)
                    ],
                },
            },
        }

    def format_tools_for_llm(self, tools: List['Tool']) -> List[Dict[str, Any]]:
        return [self.format_tool_for_llm(tool) for tool in tools]

    def format_image_in_message(self, image: ImageMessageContent) -> dict:
        if image.base64:
            return {
                'type': 'image_url',
                'image_url': {
                    'url': f'data:{image.mime_type or "image/jpeg"};base64,{image.base64}'
                },
            }
        raise NotImplementedError(
            'AWS Bedrock LLM requires image base64 data to format image content.'
        )

    def get_assistant_message_for_tool_call(
        self, response: Dict[str, Any]
    ) -> Optional[Any]:
        if isinstance(response, dict) and 'raw_message' in response:
            return response['raw_message']
        return None

    def get_tool_use_id(self, function_call: Dict[str, Any]) -> Optional[str]:
        return function_call.get('id')

    def format_function_result_message(
        self, function_name: str, content: str, tool_use_id: Optional[str] = None
    ) -> Dict[str, Any]:
        return {
            'role': 'tool',
            'tool_call_id': tool_use_id or 'unknown',
            'content': content,
            'name': function_name,
        }
```
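The dual JSON/SSE chunk handling in `stream` can be exercised in isolation. The sketch below (the helper name `extract_deltas` is hypothetical, not from the PR) mirrors that parsing logic as a pure function: first try the chunk as a single JSON object, then fall back to line-by-line SSE framing, stripping `<reasoning>` blocks either way.

```python
import json
import re


def _strip_reasoning(text: str) -> str:
    # Mirrors AWSBedrock._strip_reasoning: drop <reasoning>...</reasoning> blocks.
    return re.sub(r'<reasoning>.*?</reasoning>', '', text, flags=re.DOTALL).strip()


def extract_deltas(chunk_bytes: bytes) -> list[str]:
    """Parse one streamed chunk, handling both raw-JSON and SSE framing."""
    text = chunk_bytes.decode('utf-8').strip()
    deltas: list[str] = []
    try:
        # Case 1: the whole chunk is a single JSON object.
        data = json.loads(text)
        content = data.get('choices', [{}])[0].get('delta', {}).get('content')
        if content:
            clean = _strip_reasoning(content)
            if clean:
                deltas.append(clean)
    except json.JSONDecodeError:
        # Case 2: SSE framing, one "data: {...}" payload per line.
        for line in text.split('\n'):
            line = line.strip()
            if line.startswith('data: ') and line != 'data: [DONE]':
                try:
                    data = json.loads(line[6:])
                    content = (
                        data.get('choices', [{}])[0].get('delta', {}).get('content')
                    )
                    if content:
                        clean = _strip_reasoning(content)
                        if clean:
                            deltas.append(clean)
                except json.JSONDecodeError:
                    pass  # the real code logs and skips malformed lines
    return deltas
```

Keeping this logic in a pure function like this would also make the `stream` method directly unit-testable without a live Bedrock connection.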
**coderabbitai[bot]:** Add `boto3` to production dependencies and fix the suspicious default model ID.

`boto3` is missing from the production `dependencies` list in `pyproject.toml`; it currently appears only in the `[dependency-groups]` dev section. Users installing this package will hit an `ImportError` at runtime when they try to use `AWSBedrock`. Add `"boto3>=1.36.1,<2"` to the main `dependencies` list.

The default model `'openai.gpt-oss-20b-1:0'` does not match typical AWS Bedrock model ID patterns (e.g. `anthropic.claude-3-sonnet-20240229-v1:0`, `amazon.titan-text-express-v1`). This appears to be a placeholder or an incorrect value. Provide a valid Bedrock model ID or document why this unusual format is intentional.

Minor: `self.model` and `self.kwargs` on lines 28–29 are redundantly assigned; `BaseLLM.__init__()` already sets these (harmless but unnecessary).
**Member:** 1. You need to add it; the current code only works because some of our libraries have a boto3 dependency internally. 3. Check point 3 as well.
**coderabbitai[bot]:** @vizsatiz, confirmed on both points:

1. **boto3 dependency:** You're right that it should be explicitly added to the production `dependencies` list in `pyproject.toml`, even though it currently works through transitive dependencies. Add: `"boto3>=1.36.1,<2"`.
2. **Redundant assignments on lines 28–29:** Confirmed redundant. `BaseLLM.__init__()` already sets both `self.model` and `self.kwargs` (lines 17 and 20 in `base_llm.py`). These lines can be safely removed.
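The cleanup agreed on in point 2 can be sketched as follows. `BaseLLM` here is a minimal stub standing in for flo_ai's real base class (which stores `model` and `kwargs` in its own `__init__`); the boto3 client creation is commented out so the sketch runs without AWS credentials.

```python
class BaseLLM:
    """Stub of flo_ai's base class: it already stores model, temperature, kwargs."""

    def __init__(self, model: str, temperature: float = 0.7, **kwargs):
        self.model = model
        self.temperature = temperature
        self.kwargs = kwargs


class AWSBedrock(BaseLLM):
    def __init__(
        self,
        model: str = 'openai.gpt-oss-20b-1:0',
        temperature: float = 0.7,
        **kwargs,
    ):
        super().__init__(model=model, temperature=temperature, **kwargs)
        # The real class creates the client here:
        #   self.boto_client = boto3.client('bedrock-runtime')
        # No self.model / self.kwargs re-assignment: BaseLLM.__init__ set them.
```

The subclass still exposes the same attributes, so nothing downstream changes; the duplicate lines were pure noise.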