# CI Pipelines

sgrep fits naturally into continuous integration pipelines. Use semantic search to enforce code quality standards, detect patterns, and gate deployments.

## GitHub Actions

### Basic Quality Gate

Fail the pipeline if sensitive patterns are found:

```yaml
name: Security Scan
on: [push, pull_request]

jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install sgrep
        run: curl -sSf https://sgrep.sh/install.sh | sh
      - name: Scan for hardcoded secrets
        run: |
          if sgrep --json "hardcoded API key or secret" src/ | jq '. | length > 0'; then
            echo "Found potential hardcoded secrets"
            exit 1
          fi
```

### PR Comment on Matches

Leave a comment on pull requests when issues are found:

```yaml
name: Code Review
on: pull_request

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install sgrep
        run: curl -sSf https://sgrep.sh/install.sh | sh
      - name: Check for TODO comments
        id: scan
        run: |
          result=$(sgrep --json "TODO or FIXME" src/)
          echo "result<<EOF">$GITHUB_OUTPUT
          echo "$result">$GITHUB_OUTPUT
          echo EOF>>$GITHUB_OUTPUT
      - uses: actions/github-script@v7
        if: steps.scan.outputs.result != '[]'
        with:
          script: |
            const result = JSON.parse('${{ steps.scan.outputs.result }}');
            const body = result.map(r => `- **${r.path}:${r.line}**\n  ${r.text}`).join('\n');
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## Found ${result.length} TODOs:\n${body}`
            })
```

### Caching the Model

Speed up builds by caching the downloaded model:

```yaml
- name: Cache sgrep model
  uses: actions/cache@v4
  with:
    path: ~/.cache/sgrep
    key: sgrep-model-${{ hashFiles('**/Cargo.lock') }}
```

## GitLab CI

### Security Scanning

```yaml
stages:
  - test

security:
  stage: test
  image: rust:latest
  before_script:
    - cargo install sgrep --git https://github.com/henrikalbihn/sgrep
  script:
    # Scan for sensitive patterns
    - |
      matches=$(sgrep --json "password or credential or secret" src/)
      if [ "$matches" != "[]" ]; then
        echo "Security issues found:"
        echo "$matches" | jq -r '.[] | "\(.path):\(.line): \(.text)"'
        exit 1
      fi
  allow_failure: false
```

### Quality Metrics

Track code quality over time:

```yaml
quality:
  stage: test
  script:
    - cargo install sgrep
    # Count complex functions (heuristic)
    - complex=$(sgrep --json "complex function or nested logic" src/ | jq 'length')
    - echo "COMPLEX_FUNCTIONS=$complex" >> metrics.txt
    - echo "DETECTED_PATTERNS=$complex" > metrics.env
  artifacts:
    reports:
      metrics: metrics.txt
```

## Jenkins Pipeline

```groovy
pipeline {
    agent any
    stages {
        stage('Install sgrep') {
            steps {
                sh 'curl -sSf https://sgrep.sh/install.sh | sh'
            }
        }
        stage('Code Scan') {
            steps {
                script {
                    def matches = sh(
                        script: 'sgrep --json "deprecated API usage" src/',
                        returnStdout: true
                    )
                    def count = readJSON(text: matches).size()
                    if (count > 0) {
                        unstable("Found ${count} deprecated API uses")
                    }
                }
            }
        }
    }
}
```

## Parsing JSON Output

The `--json` flag outputs structured data that's easy to parse:

```bash
# Count total matches
count=$(sgrep --json "pattern" src/ | jq 'length')

# Get only file paths
sgrep --json "pattern" src/ | jq -r '.[].path' | sort -u

# Extract specific fields
sgrep --json "pattern" src/ | jq -r '.[] | "\(.path):\(.line)"'

# Filter by score threshold
sgrep --json "pattern" src/ | jq 'map(select(.score > 0.8))'

# Group by file
sgrep --json "pattern" src/ | jq 'group_by(.path) | map({path: .[0].path, count: length})'
```

## Common Patterns

### Enforce Documentation Coverage

```bash
# Fail if public functions lack docs
public_funcs=$(grep -r "^pub fn" src/ | wc -l)
doc_comments=$(sgrep --json "public function documentation" src/ | jq 'length')

if [ "$doc_comments" -lt "$public_funcs" ]; then
  echo "Some public functions lack documentation"
  exit 1
fi
```

### Detect Deprecated Patterns

```bash
# Find deprecated API usage
sgrep --json "deprecated or obsolete" src/ | jq -r '.[] | "\(.path):\(.line): \(.text)"'
```

### Check for Error Handling

```bash
# Find functions without error handling
if sgrep "function without error handling" src/ | grep -q "unsafe"; then
  echo "Found potential error handling issues"
  exit 1
fi
```

> **Note**: sgrep works offline and makes no network requests. The model is downloaded once and cached locally, making it ideal for air-gapped environments.

## Performance Tips

- Use `--json` output for efficient parsing
- Cache the model directory for faster builds
- Run sgrep on specific directories, not the entire repo
- Combine with `jq` for powerful filtering and aggregation
- Consider parallelizing scans for large codebases

```bash
# Parallel scan across multiple directories
find src -type d | parallel -j4 'sgrep --json "pattern" {}'
```