mattermost/server/platform/services/docextractor/pdf_test.go
JG Heithcock 88954db3de
[MM-63434] Use forked PDF library with parsing depth limit (#35947)
* [MM-63434] Use forked PDF library with parsing depth limit

Replace github.com/ledongthuc/pdf with a fork that limits object
nesting depth during parsing. Add test coverage.

* Reverting incorrect merge that lost the change to msgpack

The error was in merge 64bdff88d8
2026-04-14 10:28:59 -07:00

// Copyright (c) 2015-present Mattermost, Inc. All Rights Reserved.
// See LICENSE.txt for license information.

package docextractor

import (
	"bytes"
	"testing"

	"github.com/stretchr/testify/require"

	"github.com/mattermost/mattermost/server/v8/channels/utils/testutils"
)

// TestPdfEmptyFile checks that extracting from zero bytes of input returns an error.
func TestPdfEmptyFile(t *testing.T) {
	extractor := pdfExtractor{}
	_, err := extractor.Extract("test.pdf", bytes.NewReader([]byte{}), 0)
	require.Error(t, err)
}

// TestPdfFile checks that the text of a known-good sample PDF is extracted verbatim.
func TestPdfFile(t *testing.T) {
	extractor := pdfExtractor{}
	contentText := "\nThis is a simple document that contains some text."

	content, err := testutils.ReadTestFile("sample-doc.pdf")
	require.NoError(t, err)

	extractedText, err := extractor.Extract("sample-doc.pdf", bytes.NewReader(content), 0)
	require.NoError(t, err)
	require.Equal(t, contentText, extractedText)
}

// TestPdfDeeplyNestedObjects is a regression test for MM-63434: input that
// repeats an object header 10,000 times must produce an error and no text.
func TestPdfDeeplyNestedObjects(t *testing.T) {
	var buf bytes.Buffer
	buf.WriteString("%PDF-1.0\n")
	// Each iteration writes another "0 0 obj" header without closing the previous one.
	for range 10_000 {
		buf.WriteString("0\n0\nobj\n")
	}
	buf.WriteString("startxref\n0\n%%EOF\n")

	extractor := pdfExtractor{}
	text, err := extractor.Extract("excessive-nests.pdf", bytes.NewReader(buf.Bytes()), 0)
	require.Error(t, err)
	require.Empty(t, text)
}

// TestWrongPdfFile checks that a non-PDF payload (a .docx file) passed under a
// .pdf name is rejected with an error.
func TestWrongPdfFile(t *testing.T) {
	extractor := pdfExtractor{}

	content, err := testutils.ReadTestFile("sample-doc.docx")
	require.NoError(t, err)

	_, err = extractor.Extract("sample-doc.pdf", bytes.NewReader(content), 0)
	require.Error(t, err)
}