Skip to content

ticket #3 cs-assistant: Added :stats and verbose mode to the dev CLI#10

Open
strictshot wants to merge 1 commit into
mainfrom
feat/adding-stats
Open

ticket #3 cs-assistant: Added :stats and verbose mode to the dev CLI#10
strictshot wants to merge 1 commit into
mainfrom
feat/adding-stats

Conversation

@strictshot

Copy link
Copy Markdown

Added two dev conveniences (:stats and :verbose).
Key changes to files :

  • src/infrastructure/db/repository.py
  • src/apps/dev_cli.py
  • README.md (in the Using It section)

Manual verification done:

  1. Ran make migrate && make ingest to populate the local database.
  2. Started the CLI using make cli and verified the startup message.
  3. Executed `:stats
  4. Toggled :verbose mode on, submitted a query, and confirmed that the source URL, similarity scores, and character-limited snippets printed cleanly prior to the model's response.
  5. Checked that exit and quit still terminate the session cleanly.

Comment thread src/apps/dev_cli.py
log = get_logger(__name__)


async def _check_db() -> None:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function could be refactored a bit. Since both startup and the :stats command are meant to do the same thing (print number of sources and chunks, warn if DB is empty), replace this with a helper that does both.

Call it something like _print_db_status().

Get the counts in this function the way you do in the stats command and print them:
count_sources, count_chunks = await Repository.get_source_and_chunk_counts(session)

From that query, you already get the count of chunks so you can do a simple if count_chunks == 0: check to see if the warning needs to be printed, instead of using the Repository.has_chunks method which makes a second db query even though we already have the info we need (number of chunks) from the first query. At our scale the second db query isn't really expensive, but still unnecessary to make since we need the counts regardless and that tells us if the count is 0 or not, so I think this way would be cleaner.

I think this might leave the has_chunks method with no uses, but don't remove it since it might be worth using somewhere else in the future. It can be cleaned up later if there's really no use for it.

Comment thread src/apps/dev_cli.py

# :stats cmd
if question.lower() in {":stats"}:
async with async_session_factory() as session:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the shared helper based on the comment on line 12 has been implemented, this whole section (lines 37-40) could be replaced with the helper.

Same benefit, avoids making 2 queries when just one would be enough.

return result.scalar_one_or_none() is not None

@staticmethod
async def get_source_and_chunk_counts(session: AsyncSession) -> tuple[int, int]:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These methods currently don't have tests, even though all other repository methods have tests implemented. For consistency, implement unit tests for these in tests/infrastructure/db/test_repository.py.

Follow the pattern / conventions of the other tests in that file. Run tests before committing to verify behaviour and that they pass.

@staticmethod
async def count_chunks(session: AsyncSession) -> int:
result = await session.execute(select(func.count(ChunkRow.id)))
# if above doesn't work properly

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lines 26-27 look like temporary code that isn't needed since they're commented out? If yes, please remove them from the final PR.

Comment thread README.md
While inside the interactive CLI (`ask>`), you can use the following commands to control the session and view metadata:
| Command | Description |
| :--- | :--- |
| `:stats` | Toggles the display of performance metrics (e.g., token count, response time) for subsequent prompts. |

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These descriptions don't fully match what those CLI commands are doing (e.g. stats doesn't show response time, verbose doesn't show "API interactions"). Update the descriptions so it's more clear exactly what the commands do, something like
:stats - Prints how many sources and chunks are currently loaded in the database.
:verbose - Toggles verbose mode: for each question, prints the retrieved chunks (source URL, similarity score, and a content snippet) before the answer.

Comment thread src/apps/dev_cli.py
source_url = chunk_item.chunk.source_url
similarity_score = chunk_item.score
snippet = " ".join(
(chunk_item.chunk.content.split())[:250]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Snippet should infact be 250 characters, not words. Just replace with snippet = chunk_item.chunk.content[:250] and remove the comment below

Comment thread src/apps/dev_cli.py
@@ -19,13 +20,33 @@

async def _repl() -> None:
await _check_db()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dev CLI startup currently doesn't print the source and chunk counts, which is expected behaviour based on ticket description. Once the shared helper from the comment on line 12 has been implemented, that can be used here to both print the counts and an empty DB warning if needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants