LCORE-2860: RHOAI Prow e2e tests fail to start — service pods never become ready#2050

Draft
are-ces wants to merge 4 commits into
lightspeed-core:mainfrom
are-ces:lcore-2860-fix-rhoai-pipeline

are-ces commented Jul 2, 2026

Contributor

Summary

Fix two issues that prevent the RHOAI Prow e2e pipeline from deploying service pods:

  1. Image build failure (DockerBuildFailed): oc new-build --image=ubi9/ubi-minimal overrode the containerfile's FROM ubi9/python-312 with ubi-minimal, which lacks dnf. Removed the --image flag so the containerfile's own base image is used.

  2. Vector store registration conflict: run.yaml pre-registered the vector store and embedding model under provider_id: faiss, conflicting with the lightspeed-stack enrichment script which registers them under provider_id: byok_e2e-test-docs. Removed the duplicate entries since enrichment handles both.
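
The conflict in item 2 can be illustrated with a small check that flags any vector store registered under more than one provider. This is only a sketch, not code from this PR: the function name `find_provider_conflicts`, the shape of the registration entries, and the store ID `vs_example` are hypothetical. Only the two provider IDs (`faiss` and `byok_e2e-test-docs`) come from the description above.

```python
from collections import defaultdict


def find_provider_conflicts(registrations):
    """Return vector store IDs that are registered under more than one provider.

    `registrations` is an iterable of dicts with `vector_store_id` and
    `provider_id` keys (a hypothetical shape chosen for this sketch).
    """
    providers_by_store = defaultdict(set)
    for entry in registrations:
        providers_by_store[entry["vector_store_id"]].add(entry["provider_id"])
    # Keep only the stores claimed by two or more distinct providers.
    return {
        store_id: sorted(providers)
        for store_id, providers in providers_by_store.items()
        if len(providers) > 1
    }


# Hypothetical store ID; the provider IDs are the two named in this PR.
before_fix = [
    {"vector_store_id": "vs_example", "provider_id": "faiss"},
    {"vector_store_id": "vs_example", "provider_id": "byok_e2e-test-docs"},
]
after_fix = [
    {"vector_store_id": "vs_example", "provider_id": "byok_e2e-test-docs"},
]

print(find_provider_conflicts(before_fix))
print(find_provider_conflicts(after_fix))
```

With the pre-registered entry removed, only the enrichment script's registration remains, so the check reports no conflicts.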

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

  • Assisted-by: Claude Opus 4.6
  • Generated by: N/A

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  1. Triggered the RHOAI Prow periodic pipeline: the llama-stack image builds successfully on the ubi9/python-312 base
  2. llama-stack pod starts without vector store registration conflict
  3. lightspeed-stack connects to llama-stack successfully
  4. Verified RAG query returns results via /v1/query
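
Step 4 could be reproduced with a request like the one sketched below. The sketch rests on assumptions that are not stated in this PR: the base URL, the question text, and a JSON body with a single `query` field are all placeholders, and a real deployment may also require authentication headers. The code only builds the request; sending it is left as a comment so the snippet runs without a live service.

```python
import json
import urllib.request


def build_query_request(base_url, question):
    """Build (but do not send) a POST request for the /v1/query endpoint."""
    # Assumed body shape: a JSON object with a single `query` field.
    payload = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical address and question, for illustration only.
request = build_query_request("http://localhost:8080", "What do the test docs cover?")
print(request.get_method(), request.full_url)

# To send it against a running service:
#   with urllib.request.urlopen(request, timeout=60) as response:
#       print(json.load(response))
```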

Summary by CodeRabbit

  • Chores
    • Simplified end-to-end test and build setup by removing an unused vector store registration and streamlining the image build configuration.
    • No user-facing functionality changes were introduced.

Remove --image flag from oc new-build that was overriding the
containerfile's FROM (ubi9/python-312) with ubi-minimal, which lacks
dnf. Remove duplicate vector_stores and embedding model registration
from run.yaml since the lightspeed-stack enrichment script handles
both.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented Jul 2, 2026

Contributor

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 26d515a7-b47f-45d0-a87b-c2f44163c8bd

Walkthrough

Modifies the RHOAI e2e test configuration to remove an embedding model and FAISS vector store registration, replacing it with an empty vector store list, and removes an explicit base image argument from the pipeline's build step.

Changes

E2E RHOAI Test Config and Pipeline Updates

| Layer / File(s) | Summary |
| --- | --- |
| Vector store registration removal: tests/e2e-prow/rhoai/configs/run.yaml | Removed the all-mpnet-base-v2 embedding model and its FAISS vector store registration (including embedding dimension/model id and vector_store_id), replaced with an empty vector_stores: []. |
| Build image argument removal: tests/e2e-prow/rhoai/pipeline.sh | Removed the explicit --image="registry.access.redhat.com/ubi9/ubi-minimal" argument from the oc new-build invocation, leaving the docker strategy to use default image behavior. |

Estimated code review effort: 1 (Trivial) | ~5 minutes

Possibly related PRs

Suggested reviewers: tisnik, radofuchs

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly identifies the e2e startup failure being fixed and matches the changeset's focus. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

tisnik left a comment

Contributor

LGTM

are-ces marked this pull request as draft July 2, 2026 11:05
are-ces and others added 3 commits July 2, 2026 13:07
…ults

The generic lightspeed-stack.yaml uses default_model=gpt-4o-mini and
default_provider=openai, which don't exist in the RHOAI environment.
This caused 500/404 errors in tests that rely on default model resolution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Create vllm-model-secret from MODEL_NAME in pipeline.sh so both
llama-stack and lightspeed-stack pods read the model from a single
source instead of hardcoding it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
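
A minimal sketch of the single-source idea in the commit above: build one Secret manifest from `MODEL_NAME` so both pods can read the same value. The PR does this in pipeline.sh; the Python below is only an illustration. The secret name `vllm-model-secret` and the `MODEL_NAME` variable come from the commit message, while the data key `key`, the fallback model name, and the function name are assumptions made for this example.

```python
import json
import os


def build_model_secret(model_name, key="key"):
    """Return a Kubernetes Secret manifest that stores the model name.

    The data key name is hypothetical; the real key is not shown in this PR.
    """
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": "vllm-model-secret"},
        "type": "Opaque",
        # stringData takes plain text, so no manual base64 encoding is needed.
        "stringData": {key: model_name},
    }


# MODEL_NAME is the variable named in the commit; the fallback is a placeholder.
manifest = build_model_secret(os.environ.get("MODEL_NAME", "example-model"))
print(json.dumps(manifest, indent=2))
```

The printed JSON could then be applied with a command such as `oc apply -f -`, after which each pod would reference the secret instead of hardcoding the model.
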
When E2E_DEFAULT_MODEL_OVERRIDE or E2E_DEFAULT_PROVIDER_OVERRIDE env
vars are set, patch default_model/default_provider in every config
applied via update_config_configmap. This ensures tests that swap to
configs like lightspeed-stack-auth-noop-token.yaml use the correct
model for the environment instead of hardcoded gpt-4o-mini.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
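
The override behavior described in the last commit might look roughly like the sketch below. It is not the PR's implementation of `update_config_configmap`: the function name `apply_default_overrides`, the line-based regex approach, the sample config layout (keys nested under `inference`), and the override values are all assumptions. Only the two environment variable names, the two config keys, and the `gpt-4o-mini`/`openai` defaults come from the commit messages.

```python
import re


def apply_default_overrides(config_text, env):
    """Patch default_model/default_provider lines in a YAML config string.

    Only keys whose override variable is set (and non-empty) are changed.
    """
    overrides = {
        "default_model": env.get("E2E_DEFAULT_MODEL_OVERRIDE"),
        "default_provider": env.get("E2E_DEFAULT_PROVIDER_OVERRIDE"),
    }
    for key, value in overrides.items():
        if not value:
            continue
        # Match the whole line for this key, keeping its indentation.
        pattern = re.compile(rf"^([ \t]*{key}:[ \t]*).*$", re.MULTILINE)
        # A function replacement avoids backslash-escape surprises in `value`.
        config_text = pattern.sub(lambda m, v=value: m.group(1) + v, config_text)
    return config_text


# Hypothetical config layout and override values, for illustration only.
sample = "inference:\n  default_model: gpt-4o-mini\n  default_provider: openai\n"
env = {
    "E2E_DEFAULT_MODEL_OVERRIDE": "example-model",
    "E2E_DEFAULT_PROVIDER_OVERRIDE": "vllm",
}
print(apply_default_overrides(sample, env))
```

When neither variable is set, the text is returned unchanged, so environments that do have the hardcoded defaults are unaffected.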