ci: refactor/optimize Docker builds (#10387)

* ci: docker refactor - WIP

* retry

* fix

* improvements

* retry

* try amd64 only

* retry

* try arm

* try uv

* clean

* fix

* revert test changes

* use venv and test

* fix

* try simplification

* try setting stable

* better bake

* again

* clean up again

* fix

* try caching and test

* try separate caches

* trigger cache invalidation

* try triggering again

* Revert "try triggering again"

This reverts commit bbc42a6e2a.

* Revert "trigger cache invalidation"

This reverts commit b37310b9a1.

* Revert "try separate caches"

This reverts commit 9ef381c9aa.

* Revert "try caching and test"

This reverts commit e50d57611a.

Author: Stefano Fiorucci
Date: 2026-01-16 10:46:03 +01:00
Committer: GitHub
Parent: f63a2faa48
Commit: 0cdccae226
4 changed files with 33 additions and 32 deletions


@@ -5,9 +5,12 @@ on:
   push:
     branches:
       - main
-    paths-ignore:
-      - 'docs/**'
-      - 'docs-website/**'
+    paths:
+      - '.github/workflows/docker_release.yml'
+      - 'docker/**'
+      - 'haystack/**'
+      - 'pyproject.toml'
+      - 'VERSION.txt'
     tags:
       - "v2.[0-9]+.[0-9]+*"
@@ -42,8 +45,13 @@ jobs:
         with:
           images: $DOCKER_REPO_NAME
+      - name: Detect stable version
+        run: |
+          if [[ "${{ steps.meta.outputs.version }}" =~ ^2\.[0-9]+\.[0-9]+$ ]]; then
+            echo "IS_STABLE=true" >> "$GITHUB_ENV"
+          fi
       - name: Build base images
-        uses: docker/bake-action@v5
+        uses: docker/bake-action@v6
         env:
           IMAGE_TAG_SUFFIX: ${{ steps.meta.outputs.version }}
           HAYSTACK_VERSION: ${{ steps.meta.outputs.version }}
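
Note on the `Detect stable version` step: the check can be tried locally before pushing a tag. The following is a minimal sketch and not part of the commit. `is_stable` is a hypothetical helper name, the regex is copied from the workflow step, and the sketch assumes the version string arrives without a leading `v`, as the regex implies.

```sh
#!/usr/bin/env bash
# Sketch of the stable-version check, runnable outside GitHub Actions.
# is_stable is a hypothetical helper; the regex is the one used in the workflow step.
is_stable() {
  [[ "$1" =~ ^2\.[0-9]+\.[0-9]+$ ]]
}

# Try a few sample version strings (illustrative values only).
for v in 2.99.0 2.99.0-rc1 main 3.0.0; do
  if is_stable "$v"; then
    echo "$v: stable"
  else
    echo "$v: not stable"
  fi
done
```

Only a plain `2.Y.Z` string passes. Pre-release suffixes, branch names, and other major versions leave `IS_STABLE` at its default.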


@@ -7,29 +7,26 @@ ARG DEBIAN_FRONTEND=noninteractive
 ARG haystack_version
 RUN apt-get update && \
-    apt-get install -y --no-install-recommends \
-        build-essential \
-        git
+    apt-get install -y --no-install-recommends git
+COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/
 # Shallow clone Haystack repo, we'll install from the local sources
 RUN git clone --depth=1 --branch=${haystack_version} https://github.com/deepset-ai/haystack.git /opt/haystack
 WORKDIR /opt/haystack
 # Use a virtualenv we can copy over the next build stage
-RUN python3 -m venv --system-site-packages /opt/venv
+# Note: we use venv and not uv to create the virtualenv to make sure that the created virtualenv is accessible by pip
+# and prevent breaking changes in the image. uv can still be used to speed up installation.
+RUN python3 -m venv /opt/venv
 ENV PATH="/opt/venv/bin:$PATH"
 # Upgrade setuptools due to https://nvd.nist.gov/vuln/detail/CVE-2022-40897
-RUN pip install --upgrade pip && \
-    pip install --no-cache-dir -U setuptools && \
-    pip install --no-cache-dir .
+RUN uv pip install --no-cache-dir -U setuptools && \
+    uv pip install --no-cache-dir .
 FROM $base_image AS final
 COPY --from=build-image /opt/venv /opt/venv
 COPY --from=deepset/xpdf:latest /opt/pdftotext /usr/local/bin
 # pdftotext requires fontconfig runtime
 RUN apt-get update && apt-get install -y libfontconfig && rm -rf /var/lib/apt/lists/*
 ENV PATH="/opt/venv/bin:$PATH"


@@ -4,35 +4,25 @@
 [Haystack](https://github.com/deepset-ai/haystack) is an end-to-end LLM framework that allows you to build applications powered by LLMs, Transformer models, vector search and more. Whether you want to perform retrieval-augmented generation (RAG), document search, question answering or answer generation, Haystack can orchestrate state-of-the-art embedding models and LLMs into pipelines to build end-to-end NLP applications and solve your use case.
-## Haystack 2.0
+## Haystack 2.x
 For the latest version of Haystack there's only one image available:
 - `haystack:base-<version>` contains a working Python environment with Haystack preinstalled. This image is expected to
   be derived `FROM`.
-## Haystack 1.x image variants
-The Docker image for Haystack 1.x comes in six variants:
-- `haystack:gpu-<version>` contains Haystack dependencies as well as what's needed to run the REST API and UI. It comes with the CUDA runtime and is capable of running on GPUs.
-- `haystack:cpu-remote-inference-<version>` is a slimmed down version of the CPU image with the REST API and UI. It is specifically designed for PromptNode inferencing using remotely hosted models, such as Hugging Face Inference, OpenAI, Cohere, Anthropic, and similar.
-- `haystack:cpu-<version>` contains Haystack dependencies as well as what's needed to run the REST API and UI. It has no support for GPU so must be run on CPU.
-- `haystack:base-gpu-<version>` only contains the Haystack dependencies. It comes with the CUDA runtime and can run on GPUs.
-- `haystack:base-cpu-remote-inference-<version>` is a slimmed down version of the CPU image, specifically designed for PromptNode inferencing using remotely hosted models, such as Hugging Face Inference, OpenAI, Cohere, Anthropic, and similar.
-- `haystack:base-cpu-<version>` only contains the Haystack dependencies. It has no support for GPU so must be run on CPU.
 ## Image Development
 Images are built with BuildKit and we use `bake` to orchestrate the process.
 You can build a specific image by running:
 ```sh
-docker buildx bake gpu
+docker buildx bake base
 ```
 You can override any `variable` defined in the `docker-bake.hcl` file and build custom
 images, for example if you want to use a branch from the Haystack repo, run:
 ```sh
-HAYSTACK_VERSION=mybranch_or_tag BASE_IMAGE_TAG_SUFFIX=latest docker buildx bake gpu --no-cache
+HAYSTACK_VERSION=mybranch_or_tag BASE_IMAGE_TAG_SUFFIX=latest docker buildx bake base --no-cache
 ```
 ### Multi-Platform Builds
@@ -51,7 +41,7 @@ To get around this, you need to override the `platform` option and limit local b
 your computer's. For example, on an Apple M1 you can limit the builds to ARM only by invoking `bake` like this:
 ```sh
-docker buildx bake base-cpu --set "*.platform=linux/arm64"
+docker buildx bake base --set "*.platform=linux/arm64"
 ```
 # License


@@ -18,13 +18,19 @@ variable "BASE_IMAGE_TAG_SUFFIX" {
   default = "local"
 }
-variable "HAYSTACK_EXTRAS" {
-  default = ""
+variable "IS_STABLE" {
+  default = "false"
 }
+# 2.Y.Z releases are also tagged as "stable"
+# Example: 2.99.0 is tagged as base-2.99.0 and stable
 target "base" {
   dockerfile = "Dockerfile.base"
-  tags = ["${IMAGE_NAME}:base-${IMAGE_TAG_SUFFIX}"]
+  tags = "${compact([
+    "${IMAGE_NAME}:base-${IMAGE_TAG_SUFFIX}",
+    equal("${IS_STABLE}", "true") ? "${IMAGE_NAME}:stable" : ""
+  ])}"
   args = {
     build_image = "python:3.12-slim"
     base_image = "python:3.12-slim"