EntFsCorr.jl documentation

Head and Neck frozen section correlation.

This package parses and reshapes raw CoPath data exports.

Public Interface

EntFsCorr.classify_final_dx - Method
classify_final_dx(topline)

Categorize a string into an overall diagnostic category.

This function uses pattern matching to perform classification. The expected input is the top line of the final diagnosis. An empty string or missing returns nothing.

Examples

julia> classify_final_dx("Parathyroid adenoma")
"BENIGN"

julia> classify_final_dx("Moderate dysplasia")
"INTERMEDIATE"

julia> classify_final_dx("Squamous cell carcinoma")
"MALIGNANT"

julia> classify_final_dx("")

julia> classify_final_dx(missing)

julia> # this outputs `nothing`
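
The pattern-matching approach described above can be sketched roughly as follows. This is only an illustration: the pattern lists and the names sketch_classify and matches_any are assumptions made for this example, not the package's actual rules.

```julia
# Hypothetical pattern lists; the package's real patterns are not shown in this documentation.
const MALIGNANT_PATTERNS = [r"carcinoma"i, r"lymphoma"i, r"sarcoma"i]
const INTERMEDIATE_PATTERNS = [r"dysplasia"i, r"atypi"i]
const BENIGN_PATTERNS = [r"adenoma"i, r"benign"i, r"hyperplasia"i]

# True if any regex in `patterns` occurs in `str`.
matches_any(patterns, str) = any(p -> occursin(p, str), patterns)

# Check the most severe category first so that mixed descriptions
# resolve to the worse diagnosis; return `nothing` when nothing matches.
function sketch_classify(topline)
    (ismissing(topline) || isempty(topline)) && return nothing
    matches_any(MALIGNANT_PATTERNS, topline) && return "MALIGNANT"
    matches_any(INTERMEDIATE_PATTERNS, topline) && return "INTERMEDIATE"
    matches_any(BENIGN_PATTERNS, topline) && return "BENIGN"
    return nothing
end
```

Checking the malignant patterns first is a design choice of this sketch; whether the package orders its rules the same way is not stated here.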

EntFsCorr.classify_final_margin - Method
classify_final_margin(description, final_topline, final_rest[, overall_status])

Categorize final margin status from final diagnosis fields.

If overall_status is unspecified, it will be computed from classify_final_dx(final_topline).

Examples

julia> classify_final_margin("Jaw removal", "Squamous cell carcinoma", "Margins free of tumour")
"BENIGN"

julia> classify_final_margin("Jaw removal", "Squamous cell carcinoma", "Carcinoma extends to the margin")
"MALIGNANT"

julia> classify_final_margin("Jaw removal", "Gossypiboma", "The entire margin is involved by tumor", "BENIGN")

julia> # no output since diagnosis is benign
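
The optional overall_status argument can be modelled in Julia by a three-argument method that computes the status and forwards to the four-argument method. The sketch below assumes stand-in classifiers (sketch_dx, sketch_margin_text) with made-up patterns; only the delegation pattern is the point.

```julia
# Hypothetical stand-ins for the real classifiers.
sketch_dx(topline) = occursin(r"carcinoma"i, topline) ? "MALIGNANT" : "BENIGN"
sketch_margin_text(text) = occursin(r"free of tum"i, text) ? "BENIGN" : "MALIGNANT"

# Three-argument method: derive the overall status from the top line, then delegate.
sketch_margin(description, topline, rest) =
    sketch_margin(description, topline, rest, sketch_dx(topline))

# Four-argument method: margins are only assessed for malignant specimens.
# `description` is accepted for signature parity but unused in this sketch.
function sketch_margin(description, topline, rest, overall_status)
    overall_status == "MALIGNANT" || return nothing
    return sketch_margin_text(rest)
end
```

Passing overall_status explicitly skips the top-line classification, which helps when the status has already been computed for the same row.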

EntFsCorr.final_table - Method
final_table(vec_accession, vec_final_text)

Return a vector of vectors, each holding a case number and its associated parsed final diagnosis information.

Example

julia> df = load_raw_data()
[...]

julia> finals = final_table(df.specnum, df.final_text)
27301-element Vector{Vector}:
[...]

EntFsCorr.frozen_table - Method
frozen_table(vec_accession, vec_frozen_text)

Return a vector of vectors, each holding a case number and its associated parsed frozen section information.

Example

julia> df = load_raw_data()
[...]

julia> frozens = frozen_table(df.specnum, df.frozen_text)
15615-element Vector{Vector}:
[...]
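
One plausible way to build such a vector of vectors is to parse each report and concatenate the resulting rows. The sketch below assumes a generic helper sketch_table and a toy parser toy_parse; both are hypothetical and stand in for the package's own parsing functions. Skipping missing text is also an assumption.

```julia
# Toy parser (hypothetical): one row per non-empty line, prefixed with the case number.
toy_parse(acc, text) =
    [[acc, String(strip(l))] for l in split(text, '\n') if !isempty(strip(l))]

# Parse every (accession, text) pair and flatten the per-case rows into one vector.
function sketch_table(parse, accessions, texts)
    rows = Vector{Vector}()
    for (acc, text) in zip(accessions, texts)
        ismissing(text) && continue      # assumption: cases without text are skipped
        append!(rows, parse(acc, text))
    end
    return rows
end

rows = sketch_table(toy_parse, ["A1", "A2", "A3"], ["x\ny", missing, "z"])
```

Because each case can contribute several rows, the result can be longer than the input vectors.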

EntFsCorr.load_raw_data - Method
load_raw_data([path])

Read the first sheet of an XLSX file at path into a DataFrame.

Example

julia> df = load_raw_data("path/to/file.xlsx")
6179×12 DataFrame
  Row │ [...]

EntFsCorr.parse_field_b - Method
parse_field_b(str)

Categorize field B of frozen section reports.

Field B should already be designated as one of "BENIGN", "DEFER" or "MALIGNANT", but the frozen section pathologist occasionally enters other strings. As the examples show, the result can also be "GROSS", "OTHER" or missing.

Examples

julia> parse_field_b("Carcinoma")
"MALIGNANT"

julia> parse_field_b("Negative")
"BENIGN"

julia> parse_field_b("Moderate dysplasia")
"DEFER"

julia> parse_field_b("Weight only")
"GROSS"

julia> parse_field_b("Parathyroid tissue identified")
"OTHER"

julia> parse_field_b(missing)
missing
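
A categorizer of this kind could be sketched as an ordered list of regex rules with a fallback category and a separate method that propagates missing. The rules below are invented for illustration and are not the package's actual patterns; sketch_field_b is a hypothetical name.

```julia
# Hypothetical ordered rules: the first matching pattern wins.
const FIELD_B_RULES = [
    r"carcinoma|malignan"i => "MALIGNANT",
    r"dysplasia|defer"i    => "DEFER",
    r"negative|benign"i    => "BENIGN",
    r"weight only|gross"i  => "GROSS",
]

# Missing input stays missing.
sketch_field_b(::Missing) = missing

# Try each rule in order; anything unmatched falls back to "OTHER".
function sketch_field_b(str::AbstractString)
    for (pattern, category) in FIELD_B_RULES
        occursin(pattern, str) && return category
    end
    return "OTHER"
end
```

Rule order matters in this sketch: a string such as "Negative for carcinoma" would hit the malignant rule first, so a real implementation would need more careful patterns.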

EntFsCorr.FsDb.create_finals_dataframe - Method
create_finals_dataframe(data)

Create a DataFrame of parsed final diagnosis data.

This is a utility function to correctly call final_dataframe and final_table in combination.

Example

julia> df = load_raw_data();

julia> create_finals_dataframe(df)
27301×8 DataFrame
   Row │ specnum      part    add_part  description                        final_topline                      final_rest  ⋯
       │ Union…       Union…  Union…    Union…                             Union…                             Union…      ⋯
───────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
     1 │ [...]

EntFsCorr.FsDb.create_fs_dataframe - Method
create_fs_dataframe(data)

Create a DataFrame of parsed frozen data.

This is a utility function to correctly call frozen_dataframe and frozen_table in combination.

Example

julia> df = load_raw_data();

julia> create_fs_dataframe(df)
15615×11 DataFrame
   Row │ specnum      part    block   add_part  add_block  description                        field_a                     ⋯
       │ Union…       Union…  Union…  Union…    Union…     Union…                             Union…                      ⋯
───────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
     1 │ [...]

EntFsCorr.Utils.matches_pattern - Method
matches_pattern(patterns, str)

Check if str matches any Regex in patterns.

Examples

julia> using EntFsCorr.Utils: matches_pattern

julia> matches_pattern([r"foo", r"bar", r"baz"], "onefootwo")
true

julia> matches_pattern([r"foo", r"bar", r"baz"], "quuuuux")
false
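
The behaviour shown above can be reproduced with Base functions alone. The one-liner below is a sketch of an equivalent check under the hypothetical name sketch_matches_pattern, not necessarily how the package implements it.

```julia
# True if at least one regex in `patterns` occurs anywhere in `str`.
sketch_matches_pattern(patterns, str) = any(p -> occursin(p, str), patterns)
```

Since any short-circuits, patterns after the first match are not evaluated.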

EntFsCorr.Utils.uppercase_last_word - Method
uppercase_last_word(str)

Return the last word of str in uppercase.

Example

julia> using EntFsCorr.Utils: uppercase_last_word

julia> uppercase_last_word("T. Jones")
"JONES"
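
A minimal equivalent, assuming words are separated by whitespace, might look like the following. The name sketch_uppercase_last_word is hypothetical, and this sketch does not handle empty strings.

```julia
# Split on whitespace, take the final token, and uppercase it.
sketch_uppercase_last_word(str) = uppercase(last(split(str)))
```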

Internals

EntFsCorr.main - Function
main()

Build SQL database from raw data.

This reads from hardcoded file paths and writes to data/db.sqlite, so it will not work without modifications to the underlying functions.


EntFsCorr.margin_status_from_text - Function
margin_status_from_text(text)

Guess overall status of margins from free text.

This assumes that the overall status of the specimen is malignant. Results for benign or non-neoplastic specimens are not tested.

Example

julia> EntFsCorr.margin_status_from_text("MARGINS FREE OF TUMOR")
"BENIGN"

julia> EntFsCorr.margin_status_from_text("CARCINOMA PRESENT AT THE MEDIAL-LATERAL MARGIN")
"MALIGNANT"

EntFsCorr.frozens_to_fields - Function
frozens_to_fields([case_number, ]str)

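Parse frozen section text into one vector of fields per part. If case_number is given, it is prepended to each vector.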

Example

julia> frozen_text = "1AFS: Right anterior margin
           A.  Sufficient for diagnosis
           B.  Malignant
           C.  Squamous cell carcinoma, margin free of tumor
           D.  0.5 cm
           
       2AFS: Right inferior margin
           A. Sufficient for diagnosis
           B. Benign
           C. No tumor present";

julia> EntFsCorr.frozens_to_fields(frozen_text)
2-element Vector{Vector{Union{Nothing, String}}}:
 ["1", "AFS", nothing, nothing, "Right anterior margin", "Sufficient for diagnosis", "Malignant", "Squamous cell carcinoma, margin free of tumor", "0.5 cm", nothing]
 ["2", "AFS", nothing, nothing, "Right inferior margin", "Sufficient for diagnosis", "Benign", "No tumor present", nothing, nothing]

julia> EntFsCorr.frozens_to_fields("ABC25-123", frozen_text)
2-element Vector{Vector{Union{Nothing, String}}}:
 ["ABC25-123", "1", "AFS", nothing, nothing, "Right anterior margin", "Sufficient for diagnosis", "Malignant", "Squamous cell carcinoma, margin free of tumor", "0.5 cm", nothing]
 ["ABC25-123", "2", "AFS", nothing, nothing, "Right inferior margin", "Sufficient for diagnosis", "Benign", "No tumor present", nothing, nothing]
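
The report layout in this example (a header such as "1AFS: description" followed by lettered fields) lends itself to line-by-line regex parsing. The sketch below is an assumption about how such parsing could work, with hypothetical names and a Dict per part instead of the positional vectors the package returns; it ignores the two fields that appear as nothing in the example output.

```julia
# Hypothetical patterns for the header line and the lettered field lines.
const HEADER = r"^\s*(\d+)([A-Z]+):\s*(.*\S)\s*$"
const FIELD = r"^\s*([A-E])\.\s*(.*\S)\s*$"

# Collect one Dict per part; lettered fields attach to the most recent header.
function sketch_parse_frozen(text)
    records = Vector{Dict{String,String}}()
    for line in split(text, '\n')
        header = match(HEADER, line)
        if header !== nothing
            push!(records, Dict{String,String}(
                "part" => header[1],
                "block" => header[2],
                "description" => header[3],
            ))
            continue
        end
        field = match(FIELD, line)
        if field !== nothing && !isempty(records)
            records[end][String(field[1])] = String(field[2])
        end
    end
    return records
end
```

Keying fields by letter keeps absent fields (for example a missing "D") out of the record, where the positional format shown above uses nothing instead.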

EntFsCorr.finals_to_fields - Function
finals_to_fields([case_number, ]str)

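Parse final diagnosis text into one vector of fields per part. If case_number is given, it is prepended to each vector.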

Example

julia> final_text = "Part 1. Thyroid, left, lobectomy (13 grams):
           A.  Follicular adenoma (4.5 cm).
           B.  Nodular thyroid hyperplasia.
           
       Part 2. Central lymph node, biopsy:
           Normocellular parathyroid.";

julia> EntFsCorr.finals_to_fields(final_text)
2-element Vector{Vector{Union{Nothing, String}}}:
 ["1", nothing, "Thyroid, left, lobectomy (13 grams)", "Follicular adenoma (4.5 cm)", "
    B.  Nodular thyroid hyperplasia.
    "]
 ["2", nothing, "Central lymph node, biopsy", "Normocellular parathyroid", nothing]

julia> EntFsCorr.finals_to_fields("ABC25-123", final_text)
2-element Vector{Vector{Union{Nothing, String}}}:
 ["ABC25-123", "1", nothing, "Thyroid, left, lobectomy (13 grams)", "Follicular adenoma (4.5 cm)", "
    B.  Nodular thyroid hyperplasia.
    "]
 ["ABC25-123", "2", nothing, "Central lymph node, biopsy", "Normocellular parathyroid", nothing]