-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Nishan Jain - Technical Writer Task (RAG and Text-to-SQL) #89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThis pull request introduces the foundation for an AI-powered SQL-RAG query agent. It adds documentation and dependency declarations along with implementation files. The new Streamlit application integrates SQL database queries with semantic search via LlamaCloud and OpenAI’s GPT models. Additionally, a custom router agent workflow is defined using event classes and asynchronous methods to manage tool calls and chat interactions. Environment setup, error handling, and query processing flows are clearly structured across the new files. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant UI as "Streamlit UI"
participant App as "SQL-RAG App"
participant DB as "SQLite DB"
participant Workflow as "RouterWorkflow"
participant Llama as "LlamaCloud/LLM"
User->>UI: Enter query
UI->>App: Invoke main() & initialize app
App->>DB: Set up in-memory SQLite DB
App->>App: Retrieve API keys & setup logging
App->>Llama: Initialize LlamaCloud query engine
App->>Workflow: Initialize router workflow
User->>App: Submit query
App->>Workflow: Run query asynchronously
Workflow->>Llama: Process query via chat method
Llama-->>Workflow: Return response
Workflow-->>App: Return final result
App-->>UI: Display response
sequenceDiagram
participant User
participant Workflow as "Router Workflow"
participant LLM as "Language Model"
participant Tool as "Query Tool"
User->>Workflow: Provide query
Workflow->>LLM: Prepare chat & validate input
LLM-->>Workflow: Chat response (with possible tool calls)
alt Tool call detected
Workflow->>Tool: Dispatch tool call
Tool-->>Workflow: Return tool result
Workflow->>LLM: Append result & continue chat
else No tool call
Workflow-->>User: Return final message
end
Poem
Tip ⚡🧪 Multi-step agentic review comment chat (experimental)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (11)
Task1-sql-router-assessment/requirements.txt (1)
4-4
: Remove unintended spacing in version specifier.There's an extra space after "
==
" in line 4 for thesqlalchemy
package, which might cause issues in some parsing contexts.-sqlalchemy== 2.0.39 +sqlalchemy==2.0.39Task1-sql-router-assessment/Readme.md (2)
79-79
: Use a Markdown hyperlink instead of a bare URL.Avoid using bare URLs to comply with Markdown linting best practices.
-https://typefully.com/t/Vkxs861 +[Typefully Link](https://typefully.com/t/Vkxs861)🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
79-79: Bare URL used
null(MD034, no-bare-urls)
82-82
: Consider rephrasing "feel free to" for a more professional tone.“Feel free to” is often overused. Replacing it can polish your wording.
-Contributions are welcome! Feel free to fork the repo and submit a pull request. +Contributions are welcome! You can fork the repo and submit a pull request.🧰 Tools
🪛 LanguageTool
[style] ~82-~82: The phrase ‘feel free to’ is used quite frequently. Consider using a less frequent alternative to set your writing apart from others and make it sound more professional.
Context: ...Contributing Contributions are welcome! Feel free to fork the repo and submit a pull request...(FEEL_FREE_TO_STYLE_ME)
Task1-sql-router-assessment/sql_rag.py (2)
41-41
: Raise exceptions using the original chain for better traceback context.When re-raising exceptions, it's preferable to use
raise ... from e
to preserve the original traceback chain.Here’s a sample fix:
- except Exception as e: - raise RuntimeError(f"Failed to initialize LlamaCloudIndex: {e}") + except Exception as e: + raise RuntimeError(f"Failed to initialize LlamaCloudIndex: {e}") from e ... - except Exception as e: - raise RuntimeError(f"Failed to set up database: {e}") + except Exception as e: + raise RuntimeError(f"Failed to set up database: {e}") from e ... - except Exception as e: - raise RuntimeError(f"Failed to setup SQL query engine: {e}") + except Exception as e: + raise RuntimeError(f"Failed to setup SQL query engine: {e}") from e ... - except Exception as e: - raise RuntimeError(f"Failed to initialize workflow: {e}") + except Exception as e: + raise RuntimeError(f"Failed to initialize workflow: {e}") from e ... - except Exception as e: - raise RuntimeError(f"Query execution failed: {e}") + except Exception as e: + raise RuntimeError(f"Query execution failed: {e}") from eAlso applies to: 72-72, 86-86, 94-94, 102-102
🧰 Tools
🪛 Ruff (0.8.2)
41-41: Within an
except
clause, raise exceptions withraise ... from err
orraise ... from None
to distinguish them from errors in exception handling(B904)
130-130
: Consider asynchronous Streamlit patterns if concurrency becomes necessary.You’re calling
asyncio.run(run_query(...))
within a button callback. This is fine for simple use, but if you need additional concurrency, explore non-blocking approaches or Streamlit’s async patterns in the future.Task1-sql-router-assessment/tool_workflow.py (6)
1-2
: Use explicit Python version union syntax or future import if targeting Python < 3.10.The file uses Python 3.10+ union syntax (e.g.,
GatherToolsEvent | StopEvent
). If this code must run under earlier Python versions, consider either importingannotations
from__future__
or usingUnion[GatherToolsEvent, StopEvent]
.
16-18
: Event subclass defined concisely.The
InputEvent
class is straightforward, but consider adding typed fields or docstrings if you anticipate storing data inInputEvent
later.
20-24
: GatherToolsEvent usage.
tool_calls
is typed asAny
. If possible, refine the type to something more explicit to improve clarity (e.g.,List[ToolSelection]
), making the code more self-documenting.
41-48
: Consider clarifying parameter docstrings.The
__init__
signature is thorough, but documenting each parameter’s purpose (e.g.,tools
,timeout
,disable_validation
, etc.) with docstrings could help other developers understand usage more quickly.
58-62
: Reset method clarity.
reset
simply clears chat history. Consider resetting other session-related attributes (like thetools_dict
or partial states) if relevant to the workflow’s state machine.
63-73
: Validate message existence.You correctly raise an error when
'message'
is missing and then store user input in chat history. No immediate issues, but you may want to handle empty messages or sanitize user inputs if security is a concern.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
Task1-sql-router-assessment/Readme.md
(1 hunks)Task1-sql-router-assessment/requirements.txt
(1 hunks)Task1-sql-router-assessment/sql_rag.py
(1 hunks)Task1-sql-router-assessment/tool_workflow.py
(1 hunks)
🧰 Additional context used
🪛 LanguageTool
Task1-sql-router-assessment/Readme.md
[grammar] ~63-~63: Please check the verb form.
Context: ...y via the selected engine. ## Usage 1. Open the Streamlit UI. 2. Enter a query, e.g...
(SHE_LIVE)
[style] ~82-~82: The phrase ‘feel free to’ is used quite frequently. Consider using a less frequent alternative to set your writing apart from others and make it sound more professional.
Context: ...Contributing Contributions are welcome! Feel free to fork the repo and submit a pull request...
(FEEL_FREE_TO_STYLE_ME)
🪛 markdownlint-cli2 (0.17.2)
Task1-sql-router-assessment/Readme.md
79-79: Bare URL used
null
(MD034, no-bare-urls)
🪛 Ruff (0.8.2)
Task1-sql-router-assessment/sql_rag.py
41-41: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
72-72: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
86-86: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
94-94: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
102-102: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
🔇 Additional comments (11)
Task1-sql-router-assessment/requirements.txt (1)
1-6
: All dependencies are pinned and look consistent.Having pinned versions helps ensure reproducible installs, and the listed dependencies align with the rest of the project requirements.
Task1-sql-router-assessment/Readme.md (2)
63-63
: Grammar check appears fine; no changes necessary.The grammar in line 63 is acceptable, and there's no clear verb form issue.
🧰 Tools
🪛 LanguageTool
[grammar] ~63-~63: Please check the verb form.
Context: ...y via the selected engine. ## Usage 1. Open the Streamlit UI. 2. Enter a query, e.g...(SHE_LIVE)
1-87
: Comprehensive and well-structured documentation.This README effectively covers the setup, usage, and architecture. No major concerns regarding clarity or consistency.
🧰 Tools
🪛 LanguageTool
[grammar] ~63-~63: Please check the verb form.
Context: ...y via the selected engine. ## Usage 1. Open the Streamlit UI. 2. Enter a query, e.g...(SHE_LIVE)
[style] ~82-~82: The phrase ‘feel free to’ is used quite frequently. Consider using a less frequent alternative to set your writing apart from others and make it sound more professional.
Context: ...Contributing Contributions are welcome! Feel free to fork the repo and submit a pull request...(FEEL_FREE_TO_STYLE_ME)
🪛 markdownlint-cli2 (0.17.2)
79-79: Bare URL used
null(MD034, no-bare-urls)
Task1-sql-router-assessment/sql_rag.py (1)
1-143
: Overall, this integration is well-designed and clear.The code connects SQL and LlamaIndex seamlessly under Streamlit, with robust error handling and logical structure.
🧰 Tools
🪛 Ruff (0.8.2)
41-41: Within an
except
clause, raise exceptions withraise ... from err
orraise ... from None
to distinguish them from errors in exception handling(B904)
72-72: Within an
except
clause, raise exceptions withraise ... from err
orraise ... from None
to distinguish them from errors in exception handling(B904)
86-86: Within an
except
clause, raise exceptions withraise ... from err
orraise ... from None
to distinguish them from errors in exception handling(B904)
94-94: Within an
except
clause, raise exceptions withraise ... from err
orraise ... from None
to distinguish them from errors in exception handling(B904)
102-102: Within an
except
clause, raise exceptions withraise ... from err
orraise ... from None
to distinguish them from errors in exception handling(B904)
Task1-sql-router-assessment/tool_workflow.py (7)
3-15
: Imports look appropriate.You are importing classes and types from
llama_index
. Ensure all these dependencies are correctly installed to avoid runtime errors. Otherwise, no issues here.
26-30
: ToolCallEvent type suggestion.
tool_call
is strongly typed asToolSelection
, good job. If you expect additional fields or clarifications, consider adding them here for better maintainability.
32-36
: ToolCallEventResult type usage.The
msg
field is typed asChatMessage
, which is consistent. Make sure you always attach relevant context inadditional_kwargs
for traceability, if needed.
38-40
: High-level workflow class.The class-level docstring nicely encapsulates the purpose of the workflow.
49-57
: Tool dictionary creation and fallback LLM.
- Good approach to create a dictionary keyed by tool names.
self.llm
falling back toOpenAI
is convenient, but be cautious of environment variables or credentials that might be required.
74-97
: Conditional stop event logic.
- Code uses
achat_with_tools
to retrieve tool calls.- If no tool calls exist, it returns a
StopEvent
immediately.- This is a neat approach to short-circuit when no tool calls are needed.
No critical issues noted; just confirm that the short-circuit scenario is expected behavior in your user flow.
138-151
: Chat history accumulation and re-loop.
- Collecting tool events and appending messages is clear.
- The final return of
InputEvent()
restarts the loop, ensuring the workflow cycles again.- Confirm that the agent truly requires re-entry after tool invocation rather than finalizing if no further user input is expected.
Otherwise, looks good.
@step(pass_context=True) | ||
async def dispatch_calls(self, ctx: Context, ev: GatherToolsEvent) -> ToolCallEvent: | ||
"""Dispatches calls.""" | ||
|
||
tool_calls = ev.tool_calls | ||
await ctx.set("num_tool_calls", len(tool_calls)) | ||
|
||
# trigger tool call events | ||
for tool_call in tool_calls: | ||
ctx.send_event(ToolCallEvent(tool_call=tool_call)) | ||
|
||
return None | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return type annotation mismatch.
The method signature indicates -> ToolCallEvent
, but the function returns None
. This mismatch can cause confusion or break type-checking. Either adjust the return type or return a ToolCallEvent
(or a list of them).
-async def dispatch_calls(self, ctx: Context, ev: GatherToolsEvent) -> ToolCallEvent:
+async def dispatch_calls(self, ctx: Context, ev: GatherToolsEvent) -> None:
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
@step(pass_context=True) | |
async def dispatch_calls(self, ctx: Context, ev: GatherToolsEvent) -> ToolCallEvent: | |
"""Dispatches calls.""" | |
tool_calls = ev.tool_calls | |
await ctx.set("num_tool_calls", len(tool_calls)) | |
# trigger tool call events | |
for tool_call in tool_calls: | |
ctx.send_event(ToolCallEvent(tool_call=tool_call)) | |
return None | |
@step(pass_context=True) | |
async def dispatch_calls(self, ctx: Context, ev: GatherToolsEvent) -> None: | |
"""Dispatches calls.""" | |
tool_calls = ev.tool_calls | |
await ctx.set("num_tool_calls", len(tool_calls)) | |
# trigger tool call events | |
for tool_call in tool_calls: | |
ctx.send_event(ToolCallEvent(tool_call=tool_call)) | |
return None |
@step() | ||
async def call_tool(self, ev: ToolCallEvent) -> ToolCallEventResult: | ||
"""Calls tool.""" | ||
|
||
tool_call = ev.tool_call | ||
|
||
# get tool ID and function call | ||
id_ = tool_call.tool_id | ||
|
||
if self._verbose: | ||
print(f"Calling function {tool_call.tool_name} with msg {tool_call.tool_kwargs}") | ||
|
||
# call function and put result into a chat message | ||
tool = self.tools_dict[tool_call.tool_name] | ||
output = await tool.acall(**tool_call.tool_kwargs) | ||
msg = ChatMessage( | ||
name=tool_call.tool_name, | ||
content=str(output), | ||
role="tool", | ||
additional_kwargs={ | ||
"tool_call_id": id_, | ||
"name": tool_call.tool_name | ||
} | ||
) | ||
|
||
return ToolCallEventResult(msg=msg) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Potential KeyError for missing tools.
tool = self.tools_dict[tool_call.tool_name]
will raise KeyError
if no matching tool exists. Consider handling this with a graceful error message or fallback logic.
+if tool_call.tool_name not in self.tools_dict:
+ raise RuntimeError(f"Tool '{tool_call.tool_name}' not found.")
tool = self.tools_dict[tool_call.tool_name]
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
@step() | |
async def call_tool(self, ev: ToolCallEvent) -> ToolCallEventResult: | |
"""Calls tool.""" | |
tool_call = ev.tool_call | |
# get tool ID and function call | |
id_ = tool_call.tool_id | |
if self._verbose: | |
print(f"Calling function {tool_call.tool_name} with msg {tool_call.tool_kwargs}") | |
# call function and put result into a chat message | |
tool = self.tools_dict[tool_call.tool_name] | |
output = await tool.acall(**tool_call.tool_kwargs) | |
msg = ChatMessage( | |
name=tool_call.tool_name, | |
content=str(output), | |
role="tool", | |
additional_kwargs={ | |
"tool_call_id": id_, | |
"name": tool_call.tool_name | |
} | |
) | |
return ToolCallEventResult(msg=msg) | |
@step() | |
async def call_tool(self, ev: ToolCallEvent) -> ToolCallEventResult: | |
"""Calls tool.""" | |
tool_call = ev.tool_call | |
# get tool ID and function call | |
id_ = tool_call.tool_id | |
if self._verbose: | |
print(f"Calling function {tool_call.tool_name} with msg {tool_call.tool_kwargs}") | |
# call function and put result into a chat message | |
if tool_call.tool_name not in self.tools_dict: | |
raise RuntimeError(f"Tool '{tool_call.tool_name}' not found.") | |
tool = self.tools_dict[tool_call.tool_name] | |
output = await tool.acall(**tool_call.tool_kwargs) | |
msg = ChatMessage( | |
name=tool_call.tool_name, | |
content=str(output), | |
role="tool", | |
additional_kwargs={ | |
"tool_call_id": id_, | |
"name": tool_call.tool_name | |
} | |
) | |
return ToolCallEventResult(msg=msg) |
This LlamaIndex-based application is an AI-powered query agent that retrieves data from both a database and a file-based system using LlamaIndex and OpenAI's GPT models. It utilizes an agent router to determine the appropriate source for answering user queries.
Summary by CodeRabbit
New Features
Documentation
Chores