vault/tools/pipeline/internal/pkg/git/client/commit.go
Vault Automation 34b5b5b2ff
[VAULT-39994] pipeline(changed-files): add support for listing and checking changed files (#12127) (#12215)
We've already deployed some changed file detection in the CI pipeline. It uses the GitHub API to fetch a list of all changed files on a PR and then runs it through a simple group categorization pass. It's been a useful strategy in the context of a Pull Request because it does not depend on the local state of the Git repo.

This commit introduces a local git-based file change detection and validation system for the pipeline tool, enabling developers to identify and validate changed files before pushing code. We intend to use the new tool in two primary ways:
  - As a Git pre-push hook when pushing new or updated branches. (Implemented here)
  - As part of the scheduled automated repository synchronization. (Up next, and it will use the same `git.CheckChangedFilesReq{}` implementation.)

This will allow us to guard all pushes to `hashicorp/vault` and `ce/*` branches in `hashicorp/vault-enterprise`, whether run locally on a developer machine or in CI by our service user.

We introduce two new `pipeline` CLI commands:
  - `pipeline git list changed-files`
  - `pipeline git check changed-files`

Both support specifying what method of git inspection we want to use for the changed files list:
  - **`--branch <branch>`**: Lists all files added in the entire history of a specific branch. We use this when pushing a _new_ branch.
  - **`--range <range>`**: Lists all changed files within a commit range (e.g., `HEAD~5..HEAD`). We use this when updating an existing branch.
  - **`--commit <sha>`**: Lists all changed files in a specific commit (using `git show`). This isn't actually used in the pre-push hook, but it is useful if you wish to inspect a single commit on your branch.

The behavior when passing `--range` and `--commit` is similar: we inspect the changed file list for one or many commits, with slightly different implementations for efficiency and accuracy. The `--branch` option is a bit different. We use it to inspect the branch's entire history of changed files for enterprise files before pushing a new branch. We do this to ensure that our branch doesn't accidentally add and then subsequently remove enterprise files, leaving the contents in the history but nothing obvious in the diff.
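
The reason for inspecting a branch's whole history rather than its final diff can be shown with a small model. This is only a sketch: `commit`, `netDiff`, and `touchedInHistory` are hypothetical names invented for illustration, and the real tool inspects an actual Git repository rather than an in-memory list.

```go
package main

import (
	"fmt"
	"sort"
)

// commit is a hypothetical, simplified model of a commit: only the paths it
// added and the paths it removed.
type commit struct {
	added   []string
	removed []string
}

// netDiff returns the paths that were added on the branch and still exist at
// its tip, which is roughly what shows up in the final diff.
func netDiff(history []commit) []string {
	present := map[string]bool{}
	for _, c := range history {
		for _, p := range c.added {
			present[p] = true
		}
		for _, p := range c.removed {
			delete(present, p)
		}
	}
	return sortedKeys(present)
}

// touchedInHistory returns every path added by any commit on the branch, even
// if a later commit removed it again.
func touchedInHistory(history []commit) []string {
	seen := map[string]bool{}
	for _, c := range history {
		for _, p := range c.added {
			seen[p] = true
		}
	}
	return sortedKeys(seen)
}

// sortedKeys returns the keys of m in sorted order for stable output.
func sortedKeys(m map[string]bool) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// The second commit removes the enterprise file the first one added.
	history := []commit{
		{added: []string{"vault/core.go", "vault/core_ent.go"}},
		{removed: []string{"vault/core_ent.go"}},
	}
	fmt.Println("net diff:", netDiff(history))
	fmt.Println("history: ", touchedInHistory(history))
}
```

In this model the net diff contains only `vault/core.go`, while the history view still reports `vault/core_ent.go`, which is the case the `--branch` mode is meant to catch.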

Each command supports several different output formats. The default is a human-readable text table, though `--format json` will write all of the details as valid JSON to STDOUT. When given the `--github-output` flag, each command will write a more concise version of the JSON output to `$GITHUB_OUTPUT`. It differs from our standard JSON output in that it has been formatted to be easier to use in GitHub Actions contexts without requiring complex filtering.

When run, changed files are automatically categorized into logical groups based on their file name, just like our existing changed file detection. A follow-up to this PR will introduce a configuration-based system for classifying file groups. This will allow us to create generic support for changed file detection so that many repositories can adopt this pattern.

The major difference in behavior between the two new commands is that the `list` command always lists the changed files for the given method/target, while the `check` command additionally requires one or more disallowed changed file groups, passed via the `-g` flag. If any changed file matches one of the given groups, the command fails. That allows us to specify the `enterprise` group and have the command fail whenever a changed file belongs to it.
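
The `check` behavior described above can be sketched as a small function. Everything here is illustrative: the `group` type, `groupsFor`, and `check` are hypothetical names, and the file name rules in `groupsFor` are assumptions rather than the tool's real classification rules.

```go
package main

import (
	"fmt"
	"strings"
)

// group is a hypothetical label for a class of changed files.
type group string

const (
	groupEnterprise group = "enterprise"
	groupGo         group = "go"
)

// groupsFor assigns groups based on the file name. These rules are assumed
// for the example only.
func groupsFor(path string) []group {
	var gs []group
	if strings.HasSuffix(path, ".go") {
		gs = append(gs, groupGo)
	}
	if strings.Contains(path, "_ent.") || strings.HasPrefix(path, "enterprise/") {
		gs = append(gs, groupEnterprise)
	}
	return gs
}

// check returns an error naming every changed file that belongs to at least
// one disallowed group. A nil error means the check passed.
func check(changed []string, disallowed ...group) error {
	deny := map[group]bool{}
	for _, g := range disallowed {
		deny[g] = true
	}
	var offending []string
	for _, path := range changed {
		for _, g := range groupsFor(path) {
			if deny[g] {
				offending = append(offending, path)
				break
			}
		}
	}
	if len(offending) > 0 {
		return fmt.Errorf("disallowed changed files: %s", strings.Join(offending, ", "))
	}
	return nil
}

func main() {
	changed := []string{"vault/core.go", "vault/core_ent.go"}
	// Similar in spirit to passing `-g enterprise` to the check command.
	if err := check(changed, groupEnterprise); err != nil {
		fmt.Println("check failed:", err)
	}
}
```

A `list`-style command would stop after categorizing the files, whereas a `check`-style command turns a match against a disallowed group into a failure.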

The pre-push git hook now uses this system to prevent accidental pushes. It requires the local machine to have the `pipeline` tool in the `$PATH`, which ought not be much of a burden as a working Go toolchain is required for any Vault developer. When the tool is not present, our error messages explain how to resolve the problem and direct developers to our Slack channel if they need further assistance.

Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
2026-02-05 22:37:08 +00:00

218 lines
5.1 KiB
Go

// Copyright IBM Corp. 2016, 2025
// SPDX-License-Identifier: BUSL-1.1

package client

import (
	"context"
	"fmt"
	"strings"
)
type (
	// Cleanup are cleanup modes
	Cleanup string
	// FixupLog configures how to fix up a commit
	FixupLog string
	// UntrackedFiles are how to show untracked files
	UntrackedFiles string
)
const (
	// NOTE: despite the "String" in its name, this constant holds git's
	// "strip" cleanup mode.
	CommitCleanupModeString     Cleanup = "strip"
	CommitCleanupModeWhitespace Cleanup = "whitespace"
	CommitCleanupModeVerbatim   Cleanup = "verbatim"
	CommitCleanupModeScissors   Cleanup = "scissors"
	CommitCleanupModeDefault    Cleanup = "default"

	CommitFixupLogNone   FixupLog = ""
	CommitFixupLogAmend  FixupLog = "amend"
	CommitFixupLogReword FixupLog = "reword"

	UntrackedFilesNo     UntrackedFiles = "no"
	UntrackedFilesNormal UntrackedFiles = "normal"
	UntrackedFilesAll    UntrackedFiles = "all"
)
// CommitFixup is how to fixup a commit
type CommitFixup struct {
	FixupLog
	Commit string
}
// CommitOpts are the git commit flags and arguments
// See: https://git-scm.com/docs/git-commit
type CommitOpts struct {
	// Options
	All               bool         // --all
	AllowEmpty        bool         // --allow-empty
	AllowEmptyMessage bool         // --allow-empty-message
	Amend             bool         // --amend
	Author            string       // --author=<author>
	Branch            bool         // --branch
	Cleanup           Cleanup      // --cleanup=<mode>
	Date              string       // --date=<date>
	DryRun            bool         // --dry-run
	File              string       // --file=<file>
	Fixup             *CommitFixup // --fixup=[(amend|reword):]<commit>
	GPGSign           bool         // --gpg-sign
	GPGSignKeyID      string       // --gpg-sign=<key-id>
	Long              bool         // --long
	Patch             bool         // --patch
	Porcelain         bool         // --porcelain
	Message           string       // --message=<message>
	NoEdit            bool         // --no-edit
	NoPostRewrite     bool         // --no-post-rewrite
	NoVerify          bool         // --no-verify
	Null              bool         // --null
	Only              bool         // --only
	Quiet             bool         // --quiet
	ResetAuthor       bool         // --reset-author
	ReuseMessage      string       // --reuse-message=<commit>
	Short             bool         // --short
	Signoff           bool         // --signoff
	Status            bool         // --status
	Verbose           bool         // --verbose

	// Target
	PathSpec []string // <pathspec>
}
// Commit runs the git commit command
func (c *Client) Commit(ctx context.Context, opts *CommitOpts) (*ExecResponse, error) {
	return c.Exec(ctx, "commit", opts)
}

// String returns the options as a string
func (o *CommitOpts) String() string {
	return strings.Join(o.Strings(), " ")
}
// Strings returns the options as a string slice
func (o *CommitOpts) Strings() []string {
	if o == nil {
		return nil
	}
	opts := []string{}
	if o.All {
		opts = append(opts, "--all")
	}
	if o.AllowEmpty {
		opts = append(opts, "--allow-empty")
	}
	if o.AllowEmptyMessage {
		opts = append(opts, "--allow-empty-message")
	}
	if o.Amend {
		opts = append(opts, "--amend")
	}
	if o.Author != "" {
		opts = append(opts, fmt.Sprintf("--author=%s", o.Author))
	}
	if o.Branch {
		opts = append(opts, "--branch")
	}
	if o.Cleanup != "" {
		opts = append(opts, fmt.Sprintf("--cleanup=%s", string(o.Cleanup)))
	}
	if o.Date != "" {
		opts = append(opts, fmt.Sprintf("--date=%s", o.Date))
	}
	if o.DryRun {
		opts = append(opts, "--dry-run")
	}
	if o.File != "" {
		opts = append(opts, fmt.Sprintf("--file=%s", o.File))
	}
	if o.Fixup != nil {
		if o.Fixup.FixupLog == CommitFixupLogNone {
			opts = append(opts, fmt.Sprintf("--fixup=%s", o.Fixup.Commit))
		} else {
			opts = append(opts, fmt.Sprintf("--fixup=%s:%s", string(o.Fixup.FixupLog), o.Fixup.Commit))
		}
	}
	if o.GPGSign {
		opts = append(opts, "--gpg-sign")
	}
	if o.GPGSignKeyID != "" {
		opts = append(opts, fmt.Sprintf("--gpg-sign=%s", o.GPGSignKeyID))
	}
	if o.Long {
		opts = append(opts, "--long")
	}
	if o.Patch {
		opts = append(opts, "--patch")
	}
	if o.Porcelain {
		opts = append(opts, "--porcelain")
	}
	if o.Message != "" {
		opts = append(opts, fmt.Sprintf("--message=%s", o.Message))
	}
	if o.NoEdit {
		opts = append(opts, "--no-edit")
	}
	if o.NoPostRewrite {
		opts = append(opts, "--no-post-rewrite")
	}
	if o.NoVerify {
		opts = append(opts, "--no-verify")
	}
	if o.Null {
		opts = append(opts, "--null")
	}
	if o.Only {
		opts = append(opts, "--only")
	}
	if o.Quiet {
		opts = append(opts, "--quiet")
	}
	if o.ResetAuthor {
		opts = append(opts, "--reset-author")
	}
	if o.ReuseMessage != "" {
		opts = append(opts, fmt.Sprintf("--reuse-message=%s", o.ReuseMessage))
	}
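	if o.Short {
		opts = append(opts, "--short")
	}
	// Signoff is declared on CommitOpts, so emit its flag as well.
	if o.Signoff {
		opts = append(opts, "--signoff")
	}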
	if o.Status {
		opts = append(opts, "--status")
	}
	if o.Verbose {
		opts = append(opts, "--verbose")
	}
	// If there's a pathspec, append the paths at the very end
	if len(o.PathSpec) > 0 {
		opts = append(append(opts, "--"), o.PathSpec...)
	}
	return opts
}
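
To illustrate how the options above render, here is a standalone sketch. It uses `commitOpts`, a trimmed copy of three `CommitOpts` fields with a hypothetical `args` method mirroring `Strings`, so that it runs without the rest of the package. How `Client.Exec` consumes the options is not shown in this file, so treating the slice as an argument vector is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// commitOpts is a trimmed, standalone copy of a few CommitOpts fields, used
// only so this example can run on its own.
type commitOpts struct {
	All      bool
	Message  string
	PathSpec []string
}

// args mirrors the shape of CommitOpts.Strings for the fields above: flags
// first, then "--", then any pathspecs.
func (o *commitOpts) args() []string {
	if o == nil {
		return nil
	}
	opts := []string{}
	if o.All {
		opts = append(opts, "--all")
	}
	if o.Message != "" {
		opts = append(opts, fmt.Sprintf("--message=%s", o.Message))
	}
	if len(o.PathSpec) > 0 {
		opts = append(append(opts, "--"), o.PathSpec...)
	}
	return opts
}

func main() {
	o := &commitOpts{All: true, Message: "fix typo", PathSpec: []string{"README.md"}}
	// Each element is one argument, so a message with spaces stays intact.
	for i, arg := range o.args() {
		fmt.Printf("arg[%d] = %q\n", i, arg)
	}
	// Joining with spaces is handy for logging, but the boundary between
	// "fix" and "typo" is no longer recoverable from the joined string.
	fmt.Println(strings.Join(o.args(), " "))
}
```

The slice form keeps a message containing spaces as a single argument, while the space-joined form is best treated as display output rather than something to pass back through a shell.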