c02_data_integrity_control | hebrew text hypothesis

02_control_data_integrity

BEGIN WORKPAPER
control orientation: 02_control___data_integrity_comparative_computation_to_detect_change
section orientation: 01_workpapers___list_of_workpapers_in_c02
workpaper id: 01.01___workpaper_summary___d
first_published: YYYY-MM-DD
last_updated: YYYY-MM-DD

1_overview:
class: context
value: workpaper summary for control 02: data integrity control

2_source:
class: provenance
value: builder authored

3_dependencies:
class: dependency
value: not applicable, this is a workpaper summary

4_exclusions:
class: exclusion
value: not applicable, this is a workpaper summary

5_subject_matter:
class: substance
value: expanded section
workpapers (machine readable + human verifiable)
01.01___workpaper_summary_d

02.01_control_overview_d
02.02_control_container_excel_d
02.03_control_example_excel_d

03.01_sha256_concepts_d
03.02_sha256_considerations_d
03.03_sha256_source_d

04.01_vba_id_d
04.02_vba_container_d
04.03_vba_environment_requirements_d
04.04_vba_script_d

05.01_python_id_d
05.02_python_container_d
05.03_python_environment_requirements_d
05.04_python_script_d

appendixes (human aids, machine ignore)
a.01_vba_set_up_notes_a
a.02_python_background_notes_a

flowcharts (human aids, machine ignore)
f.01_control_overview_fc
f.02_data_transformation_simplefc
f.03_data_transformation_expandedfc
f.04_data_reversibility_boundary-map_fc
f.05_vba_flow___fc

6_constraints:
class: constraint
value: not applicable, control policy documentation only

7_reproducibility:
class: reproducibility
value: not applicable, controls subjectively implemented by builder

8_notes:
class: note
value: none
END WORKPAPER

BEGIN WORKPAPER
control orientation: 02_control___data_integrity_comparative_computation_to_detect_change
section orientation: 02_workpapers___control_overview
workpaper id: 02.01___control_overview___D
first_published: YYYY-MM-DD
last_updated: YYYY-MM-DD

1_overview:
class: context
value: overview of control 02: data integrity control

2_source:
class: provenance
value: builder designed control (no external standard asserted)

3_dependencies:
class: dependency
value: not applicable, this is a control overview

4_exclusions:
class: exclusion
value: expanded section
control design subjectivity
the control is subjectively designed by the builder; another builder may choose a different control

correctness of source data
this control does not validate correctness of source data
errors introduced prior to static computation generation are outside scope

intentional manipulation
the control does not detect intentional data manipulation occurring prior to static computation generation

other human error scope
the control does not prevent or correct human error outside the specified control process

semantic meaning
this control does not detect semantic equivalence
semantic equivalence = data that differs in representation but carries the same meaning

5_subject_matter:
class: substance
value: expanded section
assurances
if the control passes, static and live computation results are equal at the evaluated scope
this provides assurance that no detectable change has occurred to the selected data since the static computation was generated
this assurance applies only to the exact data selection, encoding, and boundaries used
this assurance does not extend beyond the evaluated scope
this assurance does not assert correctness, completeness, or intent

cell error/non-value handling
if a cell returns an error or non-value (#N/A, #VALUE!, #DIV/0!, etc.), the cell return is addressed at the occurrence level
no resolution occurs within this control workpaper
resolution, if any, is done at the dataset level
evaluation and resolution are out of scope

change detection logic
change is detected when static and live computation results are not equal
detection is binary (change = 1 / no change = 0)
a value of 1 indicates the presence of change only; it does not quantify the number or magnitude of changes
no conclusion, diagnosis, or cause of change is interpreted within this logic
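
the binary logic above can be sketched in python as follows. this is a minimal illustration, not the control's actual excel formula; the function name `change_indicator` and the shortened result strings are hypothetical.

```python
def change_indicator(static_result: str, live_result: str) -> int:
    """Return 0 when the two computation results are equal, 1 otherwise."""
    return 0 if static_result == live_result else 1


# hypothetical, shortened computation results used for illustration only
static_result = "ab12"

print(change_indicator(static_result, "ab12"))  # 0 (no change)
print(change_indicator(static_result, "ab13"))  # 1 (change present)
```

the indicator reports presence of change only; it carries no information about how many values changed or why.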

change resolution
detected changes are addressed at the occurrence level
no resolution occurs within this control
evaluation and resolution are out of scope

defined
control compares a static computation to a live computation for the purpose of detecting a change (if any)
changes are detected by manually reviewing the change-indicator calculation cell(s), where:
static computation result = live computation result → 0
static computation result ≠ live computation result → 1
results are reviewed manually by the user
data selection defined by user and varies
computation is SHA-256

execution
static computation is generated once and stored; live computation is continuously derived from current data
equality comparison is executed by formula; interpretation is performed by the user
detected differences and confirmations of no change are subject to manual review
the control operates continuously and is not limited to a point-in-time execution

implementation safeguards (optional)
optional safeguards may be applied at implementation time
examples (non-exhaustive):
- freezing control columns
- protection against accidental formula overwrite
- separate storage of static computation
- table-based data selection
safeguard selection and implementation may vary by:
- database
- dataset
- execution environment (excel, vba, python)
absence of safeguards does not affect:
- change detection logic
- computation correctness
- sha-256 determinism

input basis
control computes SHA-256 over raw cell values, not displayed or formatted values
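
a minimal python sketch of why the input basis matters. it assumes the raw value is converted to text with `str()` and encoded as UTF-8; both are assumptions made for illustration, and the serialization used by the actual workbook functions is not specified here.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """SHA-256 of text, assuming UTF-8 encoding (an assumption of this sketch)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


raw_value = 0.5          # hypothetical raw cell value
displayed_value = "50%"  # the same cell as it might appear with percent formatting

raw_hash = sha256_hex(str(raw_value))         # computed over the text "0.5"
displayed_hash = sha256_hex(displayed_value)  # computed over the text "50%"

# the two inputs are different byte sequences, so the results differ
print(raw_hash == displayed_hash)  # False
```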

scope
the control applies to any user-defined data selection, including but not limited to:
single symbol(s), cell(s), column(s), row(s), table(s), or file-level outputs
control tests output equivalence only
the control operates on the selected data as a whole, not on semantic meaning
scope is defined at execution time by the user or system configuration

6_constraints:
class: constraint
value: expanded section
algorithm dependency
this control relies on deterministic computation, not on a specific algorithm
the computation algorithm used by the control may change over time
algorithm replacement does not alter the control logic or comparison method
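
a hedged python sketch of the point above: the comparison logic stays the same when the deterministic computation is swapped. the helper names are hypothetical, and sha-512 is used only as an example replacement, not as a planned change to the control.

```python
import hashlib


def compute_hex(data: bytes, algorithm: str = "sha256") -> str:
    """Deterministic computation; the algorithm name is a parameter."""
    return hashlib.new(algorithm, data).hexdigest()


def change_indicator(static_result: str, live_result: str) -> int:
    """Control logic: identical for every algorithm."""
    return 0 if static_result == live_result else 1


data = b"selected data"  # hypothetical data selection
results = {}
for algorithm in ("sha256", "sha512"):
    static_result = compute_hex(data, algorithm)
    unchanged = change_indicator(static_result, compute_hex(data, algorithm))
    changed = change_indicator(static_result, compute_hex(data + b"!", algorithm))
    results[algorithm] = (unchanged, changed)

print(results)  # {'sha256': (0, 1), 'sha512': (0, 1)}
```

static and live results must always come from the same algorithm; results from different algorithms are not comparable.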

excel related dependencies
correct execution of the computation function
integrity of the spreadsheet environment

nature of manual review control
this control requires manual review
manual processes are by nature subject to risk

unintentional manipulation
the control is designed to detect accidental and unintended changes
the control may surface intentional changes incidentally

7_reproducibility:
class: reproducibility
value: not applicable, this is a control overview

8_notes:
class: note
value: none
END WORKPAPER

BEGIN WORKPAPER
control orientation: 02_control___data_integrity_comparative_computation_to_detect_change
section orientation: 02_workpapers___control_overview
workpaper id: 02.02___control_container___D
first_published: YYYY-MM-DD
last_updated: YYYY-MM-DD

1_overview:
class: context
value: documents microsoft excel as the execution, storage, and review container for control

2_source:
class: provenance
value: microsoft office (commercial software product)

3_dependencies:
class: dependency
value: not applicable, workpaper is to document container not control itself

4_exclusions:
class: exclusion
value: not applicable, workpaper is to document container not control itself

5_subject_matter:
class: substance
value: expanded section
configuration considerations
excel behavior may vary based on:
- excel version
- regional settings
- calculation mode (automatic vs manual)
- workbook protection settings
these factors may affect reproducibility and must be controlled or acknowledged by user

container role
microsoft excel serves as the execution, storage, and review container for the di01 control
excel provides:
- a structured grid for data selection
- formula-based computation and comparison
- persistent storage of static computation results
- visual inspection of change-indicator outputs
excel does not define the control logic or computation mechanism

execution context
the DI01 control executes within excel via worksheet formulas
execution characteristics:
- static computation results are stored as fixed values in cells
- live computation results are continuously derived from current cell values
- equality comparison is executed by formula
- change-indicator output is visible immediately upon data modification
- no batch job or scheduled execution is required

storage behavior
excel stores:
- source data values
- static computation results
- live computation formulas
- change-indicator formulas
excel does not enforce immutability of stored values
persistence relies on standard workbook save behavior

user interaction and review
the user:
- defines data selection boundaries
- initiates static computation generation
- observes change-indicator outputs
- performs manual review when change is detected
excel provides visibility but does not perform interpretation or resolution

6_constraints:
class: constraint
value: not applicable, workpaper is to document container not control itself

7_reproducibility:
class: reproducibility
value: not applicable, workpaper is to document container not control itself

8_notes:
class: note
value: none
END WORKPAPER

BEGIN WORKPAPER
control orientation: 02_control___data_integrity_comparative_computation_to_detect_change
section orientation: 02_workpapers___control_overview
workpaper id: 02.03___control_example_excel___d
first_published: YYYY-MM-DD
last_updated: YYYY-MM-DD

1_overview:
class: context
value: provide an illustrative example of control 02

2_source:
class: provenance
value: builder-authored

3_dependencies:
class: dependency
value: not applicable, workpaper is illustrative only

4_exclusions:
class: exclusion
value: not applicable, workpaper is illustrative only

5_subject_matter:
class: substance
value: expanded section
example control
example illustrates the control at a conceptual level only
example demonstrates how control 02 detects change, not how sha256 is computed
formulas in "row 1" are descriptive, not live excel formulas

control output rule:
- output 0 when static and live sha256 values are identical
- output 1 when static and live sha256 values differ

control illustrative example
illustrative data = hdkslHSKD
static sha-256 = stored sha-256 computation result
live sha-256 = computed sha-256 computation result

computation formula = sha256_text(hdksIHSKD)
static sha-256 = e6386e29cc15b45e434e6c2316cecee7a79ac82ef068d1d390de52cf9472d960
live sha-256 = e6386e29cc15b45e434e6c2316cecee7a79ac82ef068d1d390de52cf9472d960
control formula = IF([static]=[live],0,1)
[static] = [live] → 0
[static] ≠ [live] → 1
control result = 0 (the two sha-256 computations agree)

notes:
stored result execution method varies (e.g. vba in excel, python)
there are multiple excel formulas for calculating the sha256 computation
another formula could be used to identify differences between the static and live sha256 computations
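
the example flow can also be sketched outside excel. the python below is an illustration only: `sha256_text` here is a hypothetical python stand-in for the descriptive formula name, UTF-8 encoding is assumed, and "abc" is a stand-in input rather than the workpaper's illustrative data.

```python
import hashlib


def sha256_text(text: str, encoding: str = "utf-8") -> str:
    """Hypothetical helper: SHA-256 of text as a 64-character hex string."""
    return hashlib.sha256(text.encode(encoding)).hexdigest()


def control_result(static_hex: str, live_hex: str) -> int:
    """0 when static and live values are identical, 1 when they differ."""
    return 0 if static_hex == live_hex else 1


data = "abc"                    # stand-in data, not the workpaper's example
static_hex = sha256_text(data)  # generated once and stored
live_hex = sha256_text(data)    # recomputed from the current data

unchanged = control_result(static_hex, live_hex)

edited = data + " "             # e.g. a trailing space added by accident
changed = control_result(static_hex, sha256_text(edited))

print(unchanged, changed)  # 0 1
```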

6_constraints:
class: constraint
value: not applicable, workpaper is illustrative only

7_reproducibility:
class: reproducibility
value: not applicable, workpaper is illustrative only

8_notes:
class: note
value: none
END WORKPAPER

BEGIN WORKPAPER
control orientation: 02_control___data_integrity_comparative_computation_to_detect_change
section orientation: 03_workpapers___sha256_computation_independent_of_execution_method
workpaper id: 03.01___sha256_concepts___d
first_published: YYYY-MM-DD
last_updated: YYYY-MM-DD

1_overview:
class: context
value: describe concepts related to sha-256 computation

2_source:
class: provenance
value: builder-authored

3_dependencies:
class: dependency
value: not applicable, concepts only - execution of control out of scope

4_exclusions:
class: exclusion
value: the internal mathematical operations that perform the sha-256 computation are out of scope for this documentation

5_subject_matter:
class: substance
value: expanded section

concepts grouped functionally, not alphabetically

5.1 concept notes: not concept specific

descriptions are intentionally brief

descriptions are functional, not dictionary-based

concepts included were subjectively chosen by builder with GPT assistance

5.2 primitives: smallest units a system operates on directly

primitive

the smallest unit a system treats as indivisible for its intended purpose, within that system’s abstraction boundary

may have meaning, but is not decomposed further within the system

primitives are combined to build more complex structures

5.3 display primitives: what is converted to computer primitives

symbol

a character (e.g. A, 9, space, :, emoji)

some are human readable, and some are not (whitespace, non-printing characters, etc.)

symbols (sha-256)

not processed directly by sha-256

are interpreted only after encoding converts them into bytes

data scope

defined by the end user through explicit selection and boundary, prior to computation

can be a single character, a word, a row, a table, a file, a folder, etc.

can include invisible characters (spaces, line breaks, formatting)

5.4 computer primitives: what computers see/interpret

bit

a single binary value (0 or 1)

the smallest unit computers use to represent information

byte

a group of 8 bits

how computers group bits meaningfully

one byte has 256 possible patterns (2^8 arrangements of 0s and 1s)

byte pattern

the ordered sequence of bits (0s and 1s) within a byte

computers represent information through the arrangement of bits within bytes
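
a short python sketch of the bit and byte arithmetic above. the helper name `bit_pattern` is hypothetical.

```python
BITS_PER_BYTE = 8

# each bit has 2 possible values, so 8 bits give 2**8 patterns
possible_patterns = 2 ** BITS_PER_BYTE


def bit_pattern(byte_value: int) -> str:
    """Render one byte (0-255) as its ordered pattern of 8 bits."""
    return format(byte_value, "08b")


print(possible_patterns)  # 256
print(bit_pattern(0))     # 00000000
print(bit_pattern(65))    # 01000001
print(bit_pattern(255))   # 11111111
```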

5.5 encoding: process of translating human readable symbols to computer readable bytes

encoding

can be likened to translating - from symbol to bytes (groups of bits - 0s and 1s)

encoding inputs: symbols identified by character set (Unicode code points or equivalent) prior to encoding

the mapping of symbols to byte patterns

encoding systems

stable schemas or frameworks that map symbol(s) to bytes (patterns of bits)

the same symbol may be represented by different byte patterns under different schemas

byte patterns must be stable within an encoding framework for results to be deterministic and comparable

code unit

the fixed-size unit of storage used by an encoding scheme to represent code points

not a character or a symbol

it is an intermediate representation between symbols and raw bytes

code units are defined by the encoding scheme, not by sha-256

UTF-8 uses 8-bit code units; UTF-16 uses 16-bit code units

one code point may map to one or more code units depending on the encoding

UTF-8 (encoding)

a type of byte mapping system where symbols (Unicode code points) are mapped to one to four 8-bit code units (8 bits = 1 byte)

symbol-to-byte mapping is stable across all UTF-8 systems

a given code point always maps to the same byte sequence under UTF-8

the mapping is deterministic (identical inputs = identical byte patterns)

the mapping is reversible (byte pattern to symbol and symbol to byte pattern)

UTF-16 (encoding)

a type of byte mapping system where symbols (Unicode code points) are mapped to one or two 16-bit code units (16 bits = 2 bytes); code points above U+FFFF use surrogate pairs

symbol-to-code-unit mapping is stable across all UTF-16 systems; the byte order of each code unit (UTF-16LE vs UTF-16BE) must also be fixed for byte patterns to match

a given code point always maps to the same code unit sequence under UTF-16

the mapping is deterministic (identical inputs = identical byte patterns)

the mapping is reversible (byte pattern to symbol and symbol to byte pattern)
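
the code unit counts described for UTF-8 and UTF-16 can be checked with python's built-in codecs. this is an illustrative sketch; the helper name `describe` is hypothetical, and little-endian UTF-16 without a byte order mark is chosen as an assumption.

```python
def describe(symbol: str) -> dict:
    """Show how one symbol maps to code units under UTF-8 and UTF-16."""
    utf8 = symbol.encode("utf-8")
    utf16 = symbol.encode("utf-16-le")  # little-endian, no byte order mark
    return {
        "code_point": f"U+{ord(symbol):04X}",
        "utf8_hex": utf8.hex(),
        "utf8_code_units": len(utf8),         # 8-bit units
        "utf16_hex": utf16.hex(),
        "utf16_code_units": len(utf16) // 2,  # 16-bit units
        # reversibility: decoding the bytes returns the original symbol
        "round_trip": utf8.decode("utf-8") == symbol
        and utf16.decode("utf-16-le") == symbol,
    }


# a basic letter, an accented letter, and an emoji above U+FFFF
for symbol in ("A", "\u00e9", "\U0001F600"):
    print(describe(symbol))
```

the emoji needs four UTF-8 code units and a surrogate pair (two code units) in UTF-16, while "A" needs one code unit in both.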

5.6 sha-256 algorithm input concepts

input

any symbols delivered to the sha-256 algorithm must first be converted to bytes

sha-256 does not know what encoding system is used; it recognizes byte patterns only

sha-256 cannot see symbols, words, or meaning, only byte sequences

boundaries for byte input(s) are defined by the user (e.g. a word, a row, columns a + b, an entire table)

input and encoding

different encodings ⇒ different byte arrays ⇒ different sha-256 output

Same encoding + same symbols ⇒ same bytes ⇒ same sha-256 output

assumes same byte order and concatenation order

possibility of collisions noted and deemed out of scope
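
a minimal python sketch of the encoding dependency above, using the standard `hashlib` module. the input "abc" and the choice of UTF-16LE as the second encoding are assumptions for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """SHA-256 of a byte sequence as a hex string."""
    return hashlib.sha256(data).hexdigest()


text = "abc"

utf8_bytes = text.encode("utf-8")       # 3 bytes
utf16_bytes = text.encode("utf-16-le")  # 6 bytes

utf8_hash = sha256_hex(utf8_bytes)
utf16_hash = sha256_hex(utf16_bytes)

print(utf8_bytes.hex(), utf16_bytes.hex())           # 616263 610062006300
print(utf8_hash == sha256_hex(text.encode("utf-8")))  # True: same encoding, same symbols
print(utf8_hash == utf16_hash)                        # False: different encodings
```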

byte stream construction

the order in which bytes are concatenated is significant

the same bytes in a different order produce a different result

e.g., concatenating columns A+B vs B+A produces different byte streams and different hashes
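
the ordering point can be illustrated with a short python sketch. the column values are hypothetical and UTF-8 encoding is assumed.

```python
import hashlib

col_a = "alpha"  # hypothetical value from column A
col_b = "beta"   # hypothetical value from column B

a_then_b = (col_a + col_b).encode("utf-8")
b_then_a = (col_b + col_a).encode("utf-8")

hash_ab = hashlib.sha256(a_then_b).hexdigest()
hash_ba = hashlib.sha256(b_then_a).hexdigest()

# both streams contain exactly the same bytes, only the order differs
print(sorted(a_then_b) == sorted(b_then_a))  # True
print(hash_ab == hash_ba)                    # False
```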

5.7 sha-256 algorithm and computation concepts

algorithm

a fixed set of rules for transforming input into output

the rules do not change based on content

only the input affects the result

deterministic

same inputs (same bytes) produce the same result

change in input bytes produces a different result except for cryptographic collisions

different encoding systems produce different results because they produce different byte patterns for the same symbol(s)

sha-256 algorithm

A fixed, deterministic algorithm that converts input bytes into a 32-byte output

SHA = Secure Hash Algorithm

256 bits fixed output (32 bytes × 8 bits per byte)

if algorithm changes, it is no longer sha-256

sha-256 computation

execution of the sha-256 algorithm

sha-256 determinism

assumes a fixed encoding and consistent byte stream construction

sha-256 stability

the sha-256 specification text has received clarifications, but the algorithm definition has not changed

any change would create a different algorithm, not sha-256

this stability is critical for compatibility, verification, and security

sha-256 reversibility

raw output is non-reversible: arbitrary-length inputs (effectively infinite possibilities) map many-to-one onto a finite, fixed-length output space, and no practical method is known for recovering an input from its output

this non-reversibility makes the algorithm one-way

5.8 sha-256 computation other factors (salting)

salting (explicitly excluded)

salt = extra bytes deliberately added to the input before sha-256 computation

no additional bytes are appended to input prior to sha-256 computation

extra data (salt) would cause identical inputs to produce different outputs when salt differs

salt is used to:

prevent precomputation (rainbow table) attacks

ensure identical passwords do not hash to the same value across different records

salt is good for passwords

salt is bad for deterministic comparison controls and will not be used
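
a hedged python sketch of why salting is excluded. the helper names and salt values are hypothetical, and placing the salt before the data is an assumption of this sketch; the control itself adds no salt.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Unsalted SHA-256, as used for deterministic comparison."""
    return hashlib.sha256(data).hexdigest()


def salted_sha256_hex(data: bytes, salt: bytes) -> str:
    """Salted variant for illustration: extra bytes are added to the input."""
    return sha256_hex(salt + data)


data = b"identical input"

unsalted_1 = sha256_hex(data)
unsalted_2 = sha256_hex(data)
salted_1 = salted_sha256_hex(data, b"salt-one")
salted_2 = salted_sha256_hex(data, b"salt-two")

print(unsalted_1 == unsalted_2)  # True: comparable without any extra material
print(salted_1 == salted_2)      # False: same input, different salts
```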

5.9 sha-256 computation other factors (keying)

keying (explicitly excluded)

key = secret value combined with input during sha-256 computation

key is applied deliberately before or during computation

keyed sha-256 computation produces different outputs for identical inputs when keys differ

how keying is used:

authentication and message integrity (e.g., HMAC)

ensures only parties with the key can reproduce or verify the hash

used to prove authenticity, not just detect change

why keying exists:

prevents unauthorized hash forgery

binds the hash result to possession of a secret

protects against tampering in adversarial environments

why keying is excluded here:

deterministic comparison requires identical inputs → identical outputs

secret keys would prevent independent recomputation by anyone without the key

control is not performing authentication or trust enforcement

key management is out of scope
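
for contrast, a minimal python sketch of a keyed computation using HMAC from the standard `hmac` module. the keys and data are hypothetical; the control itself uses no key material.

```python
import hashlib
import hmac


def keyed_sha256_hex(data: bytes, key: bytes) -> str:
    """HMAC-SHA-256: a keyed computation, shown only for contrast."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


data = b"identical input"

unkeyed = hashlib.sha256(data).hexdigest()
keyed_1 = keyed_sha256_hex(data, b"hypothetical-key-1")
keyed_2 = keyed_sha256_hex(data, b"hypothetical-key-2")

print(keyed_1 == keyed_2)  # False: identical input, different keys
print(keyed_1 == unkeyed)  # False: keyed output differs from plain sha-256
```

anyone can recompute the unkeyed value from the data alone, which is the property the comparison control relies on.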

5.10 control position - keying and/or salting

no salt added

no secret keys used

sha-256 executed without key material

output comparability preserved across environments

5.11 sha-256 algorithm output concepts (machine readable)

output general

sha-256 takes bytes as input and produces bytes as output

different encoding systems produce different byte patterns

different byte input produces different byte output (except for cryptographic collisions)

32 byte output

the raw result produced by the sha-256 algorithm

this output is always exactly 32 bytes (fixed output)

32 bytes x 8 bits per byte = 256 bits as in sha-256

raw output is a sequence of 256 0s and 1s

below is a made-up example of one 32 byte output - 32 sets of 8 bits (0s and 1s)

01101010 00001001 11101010 11110110 00010101 11110010 10100101 11000111

00011010 10110101 11101101 00001111 10001001 00101010 10101011 00100101

01101010 00001001 11101010 11110110 00010101 11110010 10100101 11000111

00011010 10110101 11101101 00001111 10001001 00101010 10101011 00100101
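
a real 32 byte output can be rendered in the same bit layout with a short python sketch. the input b"abc" is an assumption chosen for illustration.

```python
import hashlib

digest = hashlib.sha256(b"abc").digest()  # raw output as bytes

# one group of 8 bits per byte, in output order
bit_groups = [format(byte, "08b") for byte in digest]

print(len(digest))          # 32 bytes
print(len(bit_groups) * 8)  # 256 bits

# print 8 groups per row, matching the layout used above
for start in range(0, len(bit_groups), 8):
    print(" ".join(bit_groups[start:start + 8]))
```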

5.12 representation change from 32 byte output to human readable (not done by sha-256)

problem with 32 byte output

raw output is difficult for a human to read and store

hexadecimal

number system with a base of 16

decimal -- base 10 -- digits 0-9

hex -- base 16 -- digits 0-9 plus 6 letters a-f = 10-15

64 character hexadecimal string

a standard representation of the sha-256 output (after computation)

the 32-byte output is converted into a 64-character hexadecimal string (2 hexadecimal characters per byte)

the conversion methodology is stable - same 32 byte patterns produce same 64 character hexadecimal strings
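
a minimal python sketch of the representation change. the input b"abc" is an assumption for illustration; the conversion is done by ordinary byte-to-hex formatting, not by sha-256 itself.

```python
import hashlib

digest = hashlib.sha256(b"abc").digest()  # 32 raw bytes

# each byte becomes exactly 2 hexadecimal characters
hex_string = digest.hex()
manual_hex = "".join(format(byte, "02x") for byte in digest)

# the representation is reversible back to the same 32 bytes
restored = bytes.fromhex(hex_string)

print(len(digest), len(hex_string))  # 32 64
print(hex_string == manual_hex)      # True
print(restored == digest)            # True
print(hex_string)
# ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```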

END WORKPAPER

BEGIN WORKPAPER
control orientation: 02_control___data_integrity_comparative_computation_to_detect_change
section orientation: 03_workpapers___sha256_computation_independent_of_execution_method
workpaper id: 03.01___sha256_concepts___d
first_published: YYYY-MM-DD
last_updated: YYYY-MM-DD

1_overview:
class: context
value:

2_source:
class: provenance
value:

3_dependencies:
class: dependency
value:

4_exclusions:
class: exclusion
value:

5_subject_matter:
class: substance
value:

6_constraints:
class: constraint
value:

7_reproducibility:
class: reproducibility
value:

8_notes:
class: note
value:
END WORKPAPER