1 Introduction

This document outlines the comprehensive testing strategy for morph, a web-based text editor that explores new interfaces for machine-assisted creative writing. The plan ensures that morph meets its primary objectives: providing spatial interfaces for non-linear exploration of writing, implementing efficient feature steering through sparse autoencoders (SAEs), and maintaining a file-over-app philosophy that preserves user agency. The V&V activities described in this document cover both the web-based editor (morph) and the ML inference server (asteraceae). Key areas of validation include:

  • Verification of the planning interface and suggestion generation capabilities
  • Testing of feature steering mechanisms using SAEs
  • Validation of the OpenAI-compatible inference API
  • Assessment of system performance, particularly Time-to-First-Token (TTFT) and throughput
  • Evaluation of user interaction patterns and accessibility features
  • Verification of security measures and data privacy controls

2 General Information

2.1 Symbols, Abbreviations, and Acronyms

The following data dictionary, reproduced from SRS section 7.2, defines the symbols used in this document:


| Name | Content | Type |
| --- | --- | --- |
| morph | morph + ReverseProxy + asteraceae | package |
| morph | Web interface for morph | package |
| ReverseProxy | A middleware for reverse proxy with load balancing | module |
| asteraceae | Search + ActivationCache + SAEs + logit_processor + Exo | package |
| logit_processor | A function to convert incoming requests to logits | function |
| exo | Inference engine to run a given LLM | package |
| LLM | Open-weights models to be run for inference | package |
| BlockManager | Helps manage the KV cache during inference | class |
| SAEs | A set of sparse autoencoders trained against a given LLM to steer activation generation | package |
| Search | Additional search tool to improve correctness | module |
| ActivationCache | Stores generated activations for performance purposes | class |
| streaming JSON | Outputs from asteraceae streamed back to morph | text |
| tensor matrix | A matrix of shape (n tokens × m embedding size), where each row is a float vector embedding for a token; represents inputs processed by logit_processor | text |

See also the SRS entries for logits and sparse autoencoders.


Additionally, the following abbreviations will be used:

| Symbol | Description |
| --- | --- |
| T | Test |
| V&V | Verification and Validation |
| UI | User Interface |
| SRS | Software Requirement Specification |
| CI | Continuous Integration |

2.2 Purpose

This V&V plan outlines the testing, validation, and verification procedures to be implemented for morph.

Note on revisions:

Test cases mentioned in this document are subject to change based on the development progress and feedback from the supervisor and peers. The plan will be updated accordingly to ensure that all aspects of the system are thoroughly tested and validated.

2.3 Scope

The following entails the scope of the V&V plan for morph:

2.3.1 In-Scope V&V Objectives

  • Functional verification of the web-based editor (morph), including:

    • Text editing capabilities and file management
    • Planning interface functionality
    • Version control features
    • Configuration and preference management
    • Accessibility features and keyboard navigation
  • Validation of the ML inference server (asteraceae), covering:

    • Feature steering using sparse autoencoders (SAEs)
    • OpenAI-compatible API endpoints
    • Performance metrics (TTFT, throughput)
    • Batch processing capabilities
    • KV cache optimization
  • Security and compliance testing:

    • HTTPS encryption for all communications
    • Data privacy controls
    • Content filtering for harmful outputs
    • SOC 2 compliance validation
  • User experience validation:

    • Usability testing of core writing features
    • Response time measurements
    • Cross-browser compatibility
    • Theme customization
    • Accessibility compliance
  • Integration testing between components:

    • Editor-server communication
    • File system interactions
    • Feature steering feedback loop

2.3.2 Out-of-Scope V&V Objectives

  • Testing of underlying language models (Gemma 2, Llama 3.1)
  • Validation of third-party libraries and frameworks (React, BentoML, vLLM)
  • Hardware-specific performance optimization
  • Testing on unsupported browsers or operating systems
  • Penetration testing of infrastructure components
  • Cloud provider-specific testing (beyond basic deployment verification)
  • Testing of external API integrations beyond OpenAI compatibility
  • Validation of multilingual support (initial release focuses on English)

Challenge Level and Extras

The challenge level for morph is advanced, since the project explores what is possible when building AI-native interfaces. It draws on mechanistic interpretability, a relatively new field in alignment research that requires substantial domain knowledge of the inner workings of transformer circuits.

Extras that we wish to pursue: User Manual, Usability Testing.

Reasoning:

  1. User Manual: We plan to document the functionality and usage of morph to ensure that users can effectively understand and navigate the tool. This includes clear instructions, visual examples, and troubleshooting tips to make the tool more accessible.
  2. Usability Testing: As morph aims to explore new interactions for writing, it is important to validate our design and interaction patterns through user testing. Gathering feedback from real users will help refine our interface and ensure it supports a smooth and intuitive experience.

2.4 Relevant documentation

See also Software Requirement Specification and Hazard Analysis documents for more information

3 Plan

This section provides general information about the testing plan, including the testing team, milestones, and detailed verification procedures.

3.1 Verification and Validation Team

The following project members are responsible for all procedures of the validation process, including writing and executing tests:

| Team Member | Responsibility |
| --- | --- |
| Aaron Pham | SRS Verification: ensure the SRS document accurately describes system functionalities and constraints. Implementation Verification (specifically functional requirements and performance): ensure implemented features align with the SRS specifications. |
| Nebras Khan | Design Verification (evaluate software architecture), UI/UX testing (consistency and usability), and system interaction validation (API interaction) |
| Waleed Malik | Automated Testing (focusing on backend logic, integration tests, and continuous integration pipelines) |
| Zhiwei Li | Software Validation (end-to-end testing simulating real users, regression testing of new features at each stage, and security testing for vulnerabilities) |
| Dr. Spencer Smith | Project Supervisor |
| Dr. Swati Mishra | Project Supervisor |
| Hunter Ceranic | Teaching Assistant, acting as Project Supervisor |

Regular Guidance and Review:

  • Weekly reviews and feedback sessions with Dr. Swati Mishra to ensure testing progress aligns with project milestones and system requirements.
  • Additional support and guidance will be provided by TA Hunter Ceranic as needed, especially during critical testing phases.

Collaborative Testing:

  • Verification and validation tasks related to large documents (e.g., the SRS) will be divided among team members based on their expertise. Multiple team members may collaborate on key sections to ensure thorough coverage and efficiency.
  • Overlapping areas will be coordinated during team meetings to avoid redundancy and ensure clear role allocation.

3.2 Milestones

The following table contains the expected completion dates for important testing milestones:

| Testing | Expected Date of Completion |
| --- | --- |
| Look and Feel Testing | 2025-01-10 |
| Usability and Humanity Testing | 2025-01-15 |
| Performance Testing | 2025-01-20 |
| Security Testing | 2025-02-20 |
| Maintainability and Support Testing | 2025-02-05 |
| Compliance Testing | 2025-01-20 |
| Unit Testing | 2025-02-05 |
| Tests for Nonfunctional Requirements | 2025-02-05 |

Table 3: Milestones

3.3 SRS Verification Plan

The verification of the SRS will follow a systematic approach to ensure completeness, consistency, and alignment with project goals. This process will occur over a one-month period and involve multiple stakeholders.

3.3.1 Review Approaches

  1. Supervisor Reviews

    • Weekly 15-minute discussion sessions during regular meetings
    • Focus on technical feasibility and scope alignment
    • Documentation of feedback and required changes
    • Follow-up verification of implemented changes
  2. Internal Team Reviews

    • Bi-weekly structured review sessions
    • Each team member assigned specific sections for detailed review
    • Cross-validation of requirements between related sections
    • Documentation of findings in a shared review log
  3. Requirements Traceability

    • Mapping requirements to design elements
    • Verification of requirement dependencies
    • Identification of gaps or overlaps
    • Documentation of requirement relationships
  4. External Peer Review:

    • Inviting feedback from peers outside the development team
    • Review of requirements documentation and clarity
    • Identification of potential conflicts or ambiguities
    • Documentation of external feedback and proposed changes
  5. Action Item Tracking

    • Logging action items or required changes as GitHub issues and assigning them to the appropriate team member for resolution
    • Maintaining a shared task tracker to list pending and completed action items, ensuring visibility and accountability
    • Discussing progress on action items during weekly supervisor meetings and internal reviews, with unresolved items flagged for follow-up
    • Updating corresponding sections in project documents upon completion of each action item

3.3.2 Structured Review Process

  1. Individual Review Phase (Week 1)

    • Team members review assigned sections
    • Document initial findings and concerns
    • Identify potential conflicts or ambiguities
    • Prepare discussion points for team review
  2. Team Review Phase (Week 2)

    • Group discussion of individual findings
    • Resolution of identified issues
    • Documentation of decisions and changes
    • Update of requirements based on feedback
  3. Supervisor Review Phase (Week 3)

    • Present consolidated findings
    • Incorporate supervisor feedback
    • Document required changes
    • Plan implementation timeline
  4. Final Verification Phase (Week 4)

    • Implement all approved changes
    • Final review of updates
    • Version control and documentation
    • Sign-off on verified requirements

3.3.3 Verification Checklist

Completeness

  • All functional requirements have clear acceptance criteria
  • Non-functional requirements are measurable and testable
  • User interface requirements cover all interaction points
  • Performance metrics are clearly defined
  • Security requirements address all identified risks

Consistency

  • No conflicts between requirements
  • Terminology is used consistently throughout
  • Requirements align with project goals
  • Dependencies are properly identified
  • Priorities are consistently applied

Clarity

  • Requirements are unambiguous
  • No use of unclear terms or jargon
  • Each requirement is atomic and verifiable
  • Rationale is provided for key decisions
  • Constraints are clearly stated

Technical Feasibility

  • Resource requirements are realistic
  • Performance targets are achievable
  • Technology choices are justified
  • Integration points are well-defined
  • Scalability considerations are addressed

Documentation Quality

  • Document follows template guidelines
  • Sections are properly cross-referenced
  • Version control is maintained
  • Changes are tracked and justified
  • Supporting diagrams are clear and accurate

The verification process will be coordinated by Aaron Pham, with specific section ownership assigned to team members based on their areas of expertise. Progress will be tracked through GitHub issues, and final verification results will be documented in the project repository.

3.4 Design Verification Plan

After the SRS review period, design verification will be conducted through iterative reviews with supervisors during weekly meetings, covering different parts of the design in turn. Each review session will focus on verifying that design choices meet the functional and performance requirements specified in the SRS.

The POC demo session will also be used for design verification. This process will take one month.

3.4.1 Specification Verification

| Category | Verification Items | Acceptance Criteria | Method |
| --- | --- | --- | --- |
| Architecture | System components and their interactions | Matches SRS architecture diagrams; clear separation of concerns | Design review, architecture walkthrough |
| Data Flow | Data handling between components | Follows file-over-app philosophy; efficient data transmission | Flow diagram review, data path analysis |
| Interface Design | API contracts and UI components | OpenAI-compatible endpoints; consistent UI patterns | API spec review, UI mockup validation |
| Security Design | Authentication and data protection | Meets SOC 2 requirements; implements secure communication | Security design review |
| Deployment | Infrastructure and scaling strategy | Supports auto-scaling; enables zero-downtime updates | Infrastructure diagram review |

3.4.2 Functional Verification

| Component | Design Aspects | Verification Method | Success Criteria |
| --- | --- | --- | --- |
| Text Editor | Buffer management, file operations | Code review, prototype testing | Smooth text manipulation, file persistence |
| Planning Interface | Layout, interaction flow | UI/UX review, user testing | Intuitive navigation, clear feedback |
| Feature Steering | SAE integration, activation caching | Technical review, performance testing | Accurate steering, efficient caching |
| Version Control | Diff tracking, history management | Design review, data structure validation | Reliable version tracking |
| Inference Server | Request handling, model integration | API review, load testing | Meets TTFT requirements |

3.4.3 Performance Verification

| Metric | Target | Measurement Method | Verification Approach |
| --- | --- | --- | --- |
| TTFT | 200-500ms | Response time monitoring | Load testing with varying batch sizes |
| Throughput | 300 tokens/sec (batch=4) | Request processing rate | Concurrent request simulation |
| Memory Usage | <500MB per session | Resource monitoring | Memory profiling during operations |
| UI Responsiveness | <50ms input latency | Input lag measurement | UI performance testing |
| Cache Hit Rate | >80% for common operations | Cache statistics | Operation pattern analysis |

3.4.4 Rationale and Benchmark References for Performance Metrics

The selected performance metrics for morph are informed by community benchmarks and design considerations typical of real-time LLM-assisted applications. These targets are meant to ensure low-latency interactions without requiring large-scale infrastructure.

  • Time to First Token (TTFT): The 200–500ms range is consistent with expectations for latency-sensitive tools like chat interfaces and writing assistants. Although OpenAI does not provide exact TTFT values, their rate limit documentation emphasizes prompt, sub-second responses. This threshold ensures users receive suggestions quickly enough to maintain creative flow.

  • Throughput: A target of approximately 300 tokens/second at batch size 4 reflects typical performance observed in community benchmarks for models like LLaMA 2 and Mistral 7B using optimized inference engines such as vLLM. On modern GPUs like the A100 or RTX 4090, these rates are frequently achievable in single-instance deployments without distributed scaling.

These targets are chosen to ensure:

  • Responsive, interactive text generation
  • Compatibility with moderately powerful consumer hardware
  • Support for multiple users or documents without significant delays

They support morph’s vision of enabling fast, local-first, and user-centric machine-assisted writing environments.
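
As a worked example of these targets in combination: at roughly 300 tokens/second aggregate throughput with batch size 4, each concurrent stream receives about 75 tokens/second, so a typical 150-token suggestion would finish streaming in about two seconds after the initial 200–500ms TTFT.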

3.4.5 Documentation Validation

| Document Type | Review Focus | Validation Method | Reviewers |
| --- | --- | --- | --- |
| API Documentation | Completeness, accuracy | Spec review, example validation | Backend team |
| User Guide | Clarity, coverage | User testing, content review | UX team |
| System Architecture | Consistency, completeness | Technical review | Tech leads |
| Deployment Guide | Accuracy, reproducibility | Trial deployment | DevOps team |
| Security Documentation | Compliance, comprehensiveness | Security audit | Security team |

3.4.6 Review Schedule

  1. Initial Design Review (Week 1)

    • Architecture overview
    • Component interaction validation
    • Initial performance projections
  2. Detailed Component Review (Week 2)

    • Individual component design
    • Interface specifications
    • Data flow validation
  3. Performance Design Review (Week 3)

    • Caching strategy
    • Resource optimization
    • Scaling approach
  4. Integration Design Review (Week 4)

    • Component integration
    • End-to-end flow
    • System resilience

3.4.7 Validation Tools

| Category | Tool | Purpose | Usage Phase |
| --- | --- | --- | --- |
| Code Analysis | ruff, eslint | Static analysis | Development |
| Performance | k6, Artillery | Load testing | Integration |
| Security | OWASP ZAP | Security scanning | Pre-release |
| Documentation | Vale | Style consistency | Documentation |
| API Testing | Postman | Endpoint validation | Development |

3.4.8 Design Verification Checklist

  • Architecture aligns with SRS requirements
  • Component interfaces are well-defined
  • Performance targets are achievable with design
  • Security considerations are addressed
  • Scalability is built into design
  • Documentation is complete and accurate
  • Testing strategy is comprehensive
  • Resource requirements are specified
  • Error handling is well-defined
  • Monitoring and logging are planned

This design verification plan will be executed in parallel with the development process, with findings and updates tracked through GitHub issues. Regular reviews will ensure that the implementation stays aligned with the verified design.

3.5 Implementation Verification Plan

In-group code walkthrough inspection is chosen as the main method for implementation verification of the project. Code will be reviewed asynchronously by other team members in exchange. Basic syntax errors will be caught by the linter configured for each component. Detailed test cases can be found in the system testing section.

The final presentation is also expected to serve as part of usability verification; its feedback will be collected and reflected in the final release.

3.6 Automated Testing and Verification Tools

The following include tools and automated tests that will be used for the verification of the system.

The automated testing strategy for morph is divided into two main components: the web-based editor (morph) and the ML inference server (asteraceae). Each component has specific tooling requirements and testing frameworks to ensure comprehensive coverage.

3.6.1 morph

| Category | Tool | Purpose | Configuration |
| --- | --- | --- | --- |
| Unit Testing | Jest | Component-level testing | Custom matchers for React components; coverage threshold: 80%; snapshot testing for UI components |
| End-to-End | Playwright | Browser automation testing | Cross-browser testing (Chrome, Firefox, Safari); device emulation for responsive testing; network condition simulation |
| Performance | Lighthouse | Frontend performance metrics | Performance score target: >90; First Contentful Paint < 1.5s; Time to Interactive < 3.0s |
| Accessibility | axe-core | WCAG compliance testing | WCAG 2.1 Level AA compliance; automated accessibility checks; color contrast validation |
| Static Analysis | ESLint | Code quality enforcement | TypeScript-specific rules; React hooks linting; import sorting |
| Style Checking | Prettier | Code formatting | 2-space indentation; single quotes; no semicolons |

3.6.2 asteraceae

| Category | Tool | Purpose | Configuration |
| --- | --- | --- | --- |
| Unit Testing | pytest | Backend logic testing | Asyncio support; fixture-based test data; parallel test execution |
| Performance | k6 | Load and stress testing | TTFT verification; concurrent request handling; resource utilization monitoring |
| API Testing | locust | API endpoint validation | OpenAI compatibility testing; response schema validation; error handling verification |
| Static Analysis | ruff | Code quality checks | Type checking; import sorting; code complexity limits |
| Security | Bandit | Security vulnerability scanning | Known vulnerability checks; security best practices; dependency scanning |

3.6.3 Critical Paths

Critical paths refer to components whose failure or malfunction would significantly compromise the system’s core functionality, user experience, or data integrity. These include but are not limited to:

  • Authentication components that handle login flows, token generation, and access control.
  • API layers responsible for data exchange between the morph frontend and asteraceae backend, especially those processing user-generated input or returning AI-generated text.
  • Modules in morph that manage file states, local storage, or editing buffers crucial for preserving user content.
  • Code that performs text generation, steering operations, or data validation in asteraceae.
  • Interfaces that facilitate or transform information passed between morph and asteraceae.

These components are considered mission-critical and must meet 100% test coverage as per the code coverage requirements outlined in section 3.6.6. This classification helps ensure that all high-risk paths are thoroughly validated through automated testing.

3.6.4 CI Pipeline

flowchart TD
    A[Code Push] --> B[Static Analysis]
    B --> C[Unit Tests]
    C --> D[Integration Tests]
    D --> E[Performance Tests]
    E --> F[Security Scan]
    F --> G[Build & Package]
    G --> H[Deploy to Staging]

3.6.5 Testing Environment Configuration

| Environment | Purpose | Configuration | Update Frequency |
| --- | --- | --- | --- |
| Development | Local testing | Hot reloading; mock services; debug logging | Real-time |
| Integration | Component testing | Containerized services; test databases; limited resources | Per PR |
| Staging | Pre-production validation | Production-like setup; full monitoring; load testing enabled | Daily |
| Production | Live deployment | High availability; full security; performance monitoring | Weekly |

3.6.6 Automated Testing Requirements

  1. Code Coverage Requirements

    • Frontend (morph): 80% overall coverage
    • Backend (asteraceae): 85% overall coverage
    • Critical paths: 100% coverage

    Enforcement Mechanism in CI/CD Pipeline:

    • CI/CD jobs will run automated test coverage checks after each commit using Cloudflare and Vercel pipelines.
    • If coverage thresholds are not met, merges to the main branch will be blocked using pre-merge checks in the pipeline (a configuration sketch follows this list).
    • Automated coverage reports will be generated by the CI/CD pipeline and shared with the development team.
    • Email notifications will be sent to developers when test coverage requirements are not met, providing details on failed areas.
  2. Performance Testing Thresholds

    const performanceThresholds = {
      ttft: "500ms", // Time to First Token
      p95: "2s", // 95th percentile response time
      errorRate: "1%", // Maximum error rate
      throughput: "300", // Tokens per second
    }
  3. Testing Hooks

    # Pre-commit hooks
    pre-commit:
      - ruff check
      - eslint
      - prettier
      - type checking
     
    # Pre-push hooks
    pre-push:
      - unit tests
      - integration tests
      - build verification
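
To illustrate how the coverage gate in item 1 could be enforced, the following is a minimal Jest configuration sketch for the frontend; the critical-path directory name is hypothetical and would need to match morph's actual layout:

    // jest.config.ts — sketch of the coverage thresholds described above (not the project's actual config)
    import type { Config } from "jest";

    const config: Config = {
      collectCoverage: true,
      coverageThreshold: {
        // 80% overall coverage for the frontend (morph)
        global: { branches: 80, functions: 80, lines: 80, statements: 80 },
        // 100% coverage for a mission-critical module (see section 3.6.3); path is illustrative
        "./src/editor/buffer/": { branches: 100, functions: 100, lines: 100, statements: 100 },
      },
    };

    export default config;

With thresholds in place, a CI run of `jest --coverage` exits non-zero whenever a threshold is missed, which is what allows the pre-merge check to block the merge.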

3.6.7 Quality Gates

| Stage | Criteria | Action on Failure |
| --- | --- | --- |
| Pull Request | All tests pass; coverage thresholds met; no security issues | Block merge |
| Integration | E2E tests pass; performance metrics met; API contract tests pass | Notify team |
| Deployment | Smoke tests pass; no regression; security scan clean | Rollback |

3.6.8 Monitoring and Reporting

| Metric Type | Tools | Frequency | Alert Threshold |
| --- | --- | --- | --- |
| Test Results | GitHub Actions | Per commit | Any failure |
| Coverage | Codecov | Per PR | < 80% |
| Performance | Grafana | Real-time | TTFT > 500ms |
| Security | Snyk | Daily | Critical vulnerabilities |

3.7 Software Validation Plan

The software validation process ensures that morph meets user requirements and functions effectively in its intended environment. This plan outlines the systematic approach to validating the system through user testing, performance analysis, and functionality verification.

3.7.1 Validation Approach

| Phase | Focus Area | Methods | Success Criteria |
| --- | --- | --- | --- |
| Alpha Testing | Core functionality | Internal testing, developer validation | Basic feature completeness |
| Beta Testing | User experience | Limited user group testing | User satisfaction metrics |
| Release Candidate | Production readiness | Full system validation | All validation criteria met |
| Post-Release | Production monitoring | User feedback, performance metrics | Continuous improvement metrics |

3.7.2 User Validation Groups

| Group | Description | Size | Duration | Focus Areas |
| --- | --- | --- | --- | --- |
| Internal Users | Development team and stakeholders | 5-8 people | Throughout development | Feature completeness, technical stability |
| Beta Users | Selected writers and engineers | 15-20 people | 4 weeks | User experience, feature utility |
| Usability Testers | UX researchers and target users | 10-12 people | 2 weeks | Interface design, accessibility |
| Production Pilots | Early adopters | 30-50 people | 6 weeks | Real-world usage, performance |

3.7.3 Validation Scenarios

| Category | Scenario | Success Criteria | Validation Method |
| --- | --- | --- | --- |
| Creative Writing | Complete essay draft using planning features | Successful plan creation; effective suggestion integration; smooth writing flow | User observation, feedback collection |
| Feature Steering | Customize text generation with specific style | Accurate style adaptation; consistent outputs; user satisfaction | Style comparison analysis |
| Document Management | Create and manage multiple writing projects | Reliable file operations; effective organization; version control accuracy | User workflow analysis |
| Collaborative Use | Multiple users working on shared documents | Consistent performance; feature accessibility; clear user permissions | Multi-user testing |

3.7.4 Validation Metrics

| Metric Category | Key Indicators | Target Values | Collection Method |
| --- | --- | --- | --- |
| User Satisfaction | Net Promoter Score; feature usefulness rating; interface satisfaction | NPS > 40; rating > 4/5; >80% satisfied | User surveys, interviews |
| Performance | Response time; feature availability; error rate | TTFT < 500ms; 99.9% uptime; <1% error rate | Automated monitoring |
| Feature Adoption | Feature usage rates; return user rate; session duration | >70% feature usage; >60% return rate; >15 min sessions | Analytics tracking |
| Writing Efficiency | Time to complete drafts; suggestion acceptance rate; planning feature usage | 20% time reduction; >50% acceptance; >30% usage | User metrics, surveys |

3.7.5 Validation Process

  1. Pre-validation Setup

    • Environment preparation
    • Test data creation
    • User group selection
    • Success criteria documentation
  2. Validation Execution

    • Structured testing sessions
    • User feedback collection
    • Performance monitoring
    • Issue tracking
  3. Data Collection

    • Automated metrics gathering
    • User surveys
    • Interview sessions
    • Performance logs
  4. Analysis and Reporting

    • Metrics analysis
    • User feedback synthesis
    • Issue categorization
    • Improvement recommendations
  5. Feedback Integration and Iteration

    • Prioritize fixes from analyzed feedback
    • Assign tasks to relevant team members
    • Apply updates in future sprints
    • Re-test improved features with users
    • Document changes and close the loop

3.7.6 Validation Schedule

| Stage | Timeline | Activities | Deliverables |
| --- | --- | --- | --- |
| Planning | Week 1-2 | Validation strategy development; test group formation; environment setup | Validation plan document |
| Alpha Testing | Week 3-6 | Internal testing; core feature validation; performance baseline | Alpha test report |
| Beta Testing | Week 7-10 | User group testing; feature refinement; performance optimization | Beta test report |
| Final Validation | Week 11-12 | Full system validation; documentation review; release preparation | Final validation report |

3.7.7 Acceptance Criteria

| Component | Criteria | Validation Method |
| --- | --- | --- |
| Text Editor | Text manipulation responsiveness; file operation reliability; interface accessibility; customizable visualization workflows | Usability testing, performance metrics, accessibility testing |
| Planning Features | Suggestion relevance; planning workflow efficiency; user satisfaction; customizable task and progress options | User feedback, feature usage analysis, workflow adaptability testing |
| Feature Steering | Style accuracy; generation consistency; user control effectiveness | Style analysis, user validation, customization testing |
| Performance | Response time targets; system stability; resource utilization | Automated monitoring, load testing |
| Security | Data privacy; access control; secure communication | Security audit, penetration testing |

3.7.8 Issue Management

| Priority | Response Time | Resolution Time | Escalation Path |
| --- | --- | --- | --- |
| Critical | 1 hour | 4 hours | Development lead → Project manager |
| High | 4 hours | 24 hours | Technical lead → Development lead |
| Medium | 24 hours | 72 hours | Developer → Technical lead |
| Low | 48 hours | 1 week | Developer → Issue tracking |

This validation plan ensures that morph meets its intended purpose of providing an effective, user-friendly environment for creative writing while maintaining high standards for performance and reliability. The plan will be reviewed and updated based on findings throughout the validation process.

4 System Tests

This section outlines tests for verifying both functional and non-functional requirements of the software, ensuring it meets user expectations and performs reliably.

4.1 Tests for Functional Requirements

This section outlines test cases for verifying the core functional requirements of morph as specified in the SRS. The tests are organized by major feature areas, with each test designed to validate specific user interactions and system behaviors. For detailed input data associated with each test case, refer to (6.2 Test Input Table).

4.1.1 Planning and Suggestion Features

These tests verify requirements FR-1 through FR-3, which specify the system’s capability to provide writing assistance through planning suggestions and model-guided steering.

4.1.1.1 Planning Suggestion Validation
  1. Test-FR-P1

    • Control: Manual
    • Initial State: Editor opened, planning mode active
    • Input: Valid prompt (e.g., “environmental sustainability”)
    • Output: Planning suggestions appear within MAX_DISPLAY_TIME
    • Test Case Derivation: Based on FR-1’s requirement for timely suggestion generation
    • How test will be performed: Enter prompt and start planning. Verify at least one suggestion appears within MAX_DISPLAY_TIME.
  2. Test-FR-P2

    • Control: Manual
    • Initial State: Editor opened, planning mode active
    • Input: Unintelligible prompt (e.g., “asdh123!@#”)
    • Output: Error message requesting input refinement
    • Test Case Derivation: Derived from FR-1’s need for robust input handling
    • How test will be performed: Enter nonsensical characters and verify that an input validation error message appears clearly instructing the user to revise their input due to unrecognized content.
  3. Test-FR-P3

    • Control: Manual
    • Initial State: Editor opened, planning mode active
    • Input: Large text block (LARGE_INPUT_SIZE) on climate change
    • Output: Condensed suggestions or length warning within LENGTH_RESPONSE_TIME
    • Test Case Derivation: Based on FR-1’s requirement for handling varied input lengths
    • How test will be performed: Paste a LARGE_INPUT_SIZE word passage and verify that the system responds within LENGTH_RESPONSE_TIME with a clear length-related warning message prompting the user to shorten the input.
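
Although these planning tests are controlled manually, a check like Test-FR-P1 could later be automated with Playwright (section 3.6.1). The sketch below is illustrative only: the URL, selectors, and the concrete value bound to MAX_DISPLAY_TIME are assumptions, not morph's actual markup:

    // test-fr-p1.spec.ts — hypothetical automation of Test-FR-P1
    import { test, expect } from "@playwright/test";

    const MAX_DISPLAY_TIME = 5_000; // assumed binding for the symbolic constant, in ms

    test("planning suggestions appear within MAX_DISPLAY_TIME", async ({ page }) => {
      await page.goto("http://localhost:3000"); // assumed dev URL
      await page.getByRole("button", { name: "Plan" }).click(); // assumed control name
      await page.getByRole("textbox").fill("environmental sustainability");
      await page.keyboard.press("Enter");
      // At least one suggestion must render before the deadline.
      await expect(page.getByTestId("planning-suggestion").first())
        .toBeVisible({ timeout: MAX_DISPLAY_TIME });
    });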

4.1.2 Text Generation Features

These tests verify requirements FR-4 through FR-7, focusing on the system’s text generation and style adaptation capabilities.

4.1.2.1 Steering Validation
  1. Test-FR-S1

    • Control: Manual
    • Initial State: Editor opened, steering feature enabled
    • Input: “Didion” tone selection with prompt “reflection on modern life”
    • Output: Text suggestions matching Didion’s writing style under Lexical Constraints.
    • Test Case Derivation: Based on FR-2’s requirement for style-specific generation
    • How test will be performed: Enable steering, select “Didion” tone, and generate output using the prompt. Compare the output to the (6.3.1 Didion’s Writing Style Validation) checklist.
  2. Test-FR-S2

    • Control: Manual
    • Initial State: Editor opened, steering feature enabled
    • Input: User’s writing sample for style adaptation
    • Output: Customized suggestions within STYLE_MATCH_TIME
    • Test Case Derivation: Derived from FR-2’s requirement for personalization
    • How test will be performed: Upload a writing sample (e.g., an excerpt from a Joan Didion essay or similar source). Measure the time from upload to generation, where it must not exceed STYLE_MATCH_TIME. Then, evaluate the output using the (6.3.1 Didion’s Writing Style Validation) checklist.

4.1.3 User Interaction Features

These tests verify requirements FR-8 through FR-10, covering the system’s user interaction capabilities including feedback panels and preference management.

4.1.3.1 LTR Feedback Panel Validation
  1. Test-FR-F1

    • Control: Manual
    • Initial State: Editor opened with LTR feedback panel active, initial suggestions present
    • Input: User feedback on suggestion (tone adjustment, alternative phrasing)
    • Output: Real-time suggestion updates with clear preview
    • Test Case Derivation: Based on FR-3’s requirement for real-time feedback integration
    • How test will be performed: Generate suggestions, provide feedback, verify real-time updates and clarity.
4.1.3.2 User Preferences Validation
  1. Test-FR-TG1

    • Control: Manual
    • Initial State: Editor opened with settings panel accessible
    • Input: Preference settings (formal tone, narrative style, passive voice, high formality)
    • Output: Consistent application of preferences across generations
    • Test Case Derivation: Derived from FR-4’s requirement for persistent preference application
    • How test will be performed: Enable the desired preferences (e.g., formal tone, narrative style) in the settings panel. Generate at least three outputs using the same prompt. Evaluate formal tone consistency using the (6.3.2 Formal Tone Generation) checklist.

4.1.4 Profile Management Features

These tests verify requirements FR-5 and FR-11, focusing on user profile management and configuration persistence.

4.1.4.1 Profile Configuration Validation
  1. Test-FR-UP1

    • Control: Manual
    • Initial State: Editor opened with profile settings access
    • Input: Creative writing configuration (informal tone, narrative style)
    • Output: Successfully saved profile with retained settings
    • Test Case Derivation: Based on FR-5’s requirement for profile saving
    • How test will be performed: Create profile, save settings, verify retention.
  2. Test-FR-UP2

    • Control: Manual
    • Initial State: Profile panel with one existing profile
    • Input: Academic writing profile creation
    • Output: New profile saved without affecting existing profiles
    • Test Case Derivation: Derived from FR-5’s requirement for multiple profile support
    • How test will be performed: Create second profile, verify both profiles maintain settings.

4.1.5 Document Management Features

These tests verify requirements FR-6 and FR-8, covering document navigation and version control capabilities.

4.1.5.1 Revision Management Validation
  1. Test-FR-RM1

    • Control: Manual
    • Initial State: Document with multiple sections loaded
    • Input: Tree view navigation enabled
    • Output: Visual document map with section hierarchy
    • Test Case Derivation: Based on FR-6’s requirement for non-linear navigation
    • How test will be performed: Enable tree view, verify section visualization.
4.1.5.2 Version Control Validation
  1. Test-FR-VC1

    • Control: Manual
    • Initial State: Editor with version history
    • Input: Navigation to previous version
    • Output: Previous version display with restore/diff options
    • Test Case Derivation: Based on FR-8’s requirement for version navigation
    • How test will be performed: Access version menu, verify display and options.
  2. Test-FR-VC2

    • Control: Manual
    • Initial State: Multiple versions available
    • Input: Version reversion command
    • Output: Selected version becomes current with edit capability
    • Test Case Derivation: Based on FR-8’s requirement for version restoration
    • How test will be performed: Select revert option, verify state change.

4.1.6 Progress Tracking Features

These tests verify requirements FR-11 and FR-12, focusing on writing goals and progress monitoring.

4.1.6.1 Writing Goals Validation
  1. Test-FR-PT1

    • Control: Manual
    • Initial State: Progress tracking enabled
    • Input: Word count goal (WORD_GOAL)
    • Output: Real-time progress meter
    • Test Case Derivation: Based on FR-11’s requirement for progress tracking
    • How test will be performed: Set goal, verify meter updates.
  2. Test-FR-PT2

    • Control: Manual
    • Initial State: Tone analysis enabled
    • Input: Formal tone consistency goal
    • Output: Tone deviation alerts and suggestions
    • Test Case Derivation: Based on FR-11’s requirement for style consistency
    • How test will be performed: Write with varying tones, verify alerts.

4.1.7 Document Export Features

These tests verify requirement FR-13, covering document export capabilities.

4.1.7.1 Export Functionality Validation
  1. Test-FR-E1

    • Control: Manual
    • Initial State: Completed document ready for export
    • Input: PDF export selection
    • Output: Formatted PDF with preserved content
    • Test Case Derivation: Based on FR-13’s requirement for PDF export
    • How test will be performed: Export document, verify format retention.
  2. Test-FR-E2

    • Control: Manual
    • Initial State: Completed document ready for export
    • Input: Plain text export selection
    • Output: Raw text file without formatting
    • Test Case Derivation: Based on FR-13’s requirement for plain text export
    • How test will be performed: Export document, verify content preservation.

4.1.8 Interface Customization Features

These tests verify requirement FR-14, covering visual theme customization.

4.1.8.1 Theme Customization Validation
  1. Test-FR-VT1

    • Control: Manual
    • Initial State: Default light theme active
    • Input: Dark mode selection
    • Output: Consistent dark theme application
    • Test Case Derivation: Based on FR-14’s requirement for theme switching
    • How test will be performed: Switch theme, verify visual consistency.

4.2 Tests for Nonfunctional Requirements

4.3 Look and Feel

4.3.1 Verify Unified, Non-Intrusive, and Uncluttered Visual Design

Test-LF-A1

  • Type: Structural, Static, Manual
  • Initial State: The fully developed morph application is accessible on various devices.
  • Input/Condition: Access the application UI on different devices and screen sizes.
  • Output/Result: Confirmation that the UI is unified, non-intrusive, and uncluttered across all interfaces.
  • How test will be performed: Conduct a design review by assembling a team of UI/UX experts who will use a predefined checklist based on design guidelines. Usability testing will be conducted with the target number of participants (USABILITY_TEST_PARTICIPANTS) representing primary user personas, followed by analysis of survey feedback (6.1 Usability Survey Questions).

Test-LF-A2

  • Type: Structural, Static, Manual
  • Initial State: The application’s UI components are fully implemented.
  • Input/Condition: All UI screens and components are available for review.
  • Output/Result: Verification that standardized typography and color palettes are consistently applied.
  • How test will be performed: Conduct a UI audit using design system documentation and WCAG Contrast Checker. Manually verify font sizes, styles, and color codes across screens to ensure adherence to design standards.

4.4 Usability and Humanity

4.4.1 Assess the Planning Interface

Test-UH-EOU3

  • Type: Structural, Dynamic, Manual
  • Initial State: Planning interface is accessible.
  • Input/Condition: Use the planning interface to organize and adjust creative writing steps.
  • Output/Result: Users can effectively organize and debug their writing using the interface.
  • How test will be performed: Provide users with a writing task that requires planning. Observe their navigation and utilization of features. Collect feedback on effectiveness and usability through surveys and interviews (6.1 Usability Survey Questions).

4.4.2 Measure Onboarding Time for New Users

Test-UH-L1

  • Type: Structural, Dynamic, Manual
  • Initial State: The application is ready for first-time use.
  • Input/Condition: Provide new users with access to the application without prior instruction.
  • Output/Result: Users begin creating or editing content within ONBOARDING_TIME.
  • How test will be performed: Recruit participants unfamiliar with morph. Time their process from start to content creation. Note obstacles and gather onboarding feedback. This will then be followed by survey feedback analysis (6.1 Usability Survey Questions).

4.4.3 Verify Keyboard Navigation Accessibility

Test-UH-A2

  • Type: Structural, Dynamic, Manual
  • Initial State: Application is fully developed.
  • Input/Condition: Navigate the application using only the keyboard.
  • Output/Result: All UI components are accessible via keyboard.
  • How test will be performed: Verify focus indicators and functionality of interactive elements on the main interface. Perform common tasks without using a mouse to ensure accessibility.
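
Since axe-core is already part of the tooling (section 3.6.1), the manual keyboard pass could be supplemented by an automated sketch along these lines; the URL and the ten-element traversal depth are assumptions:

    // keyboard-nav.spec.ts — sketch of automated support for Test-UH-A2
    import { test, expect } from "@playwright/test";
    import AxeBuilder from "@axe-core/playwright";

    test("main interface is keyboard-reachable and passes axe checks", async ({ page }) => {
      await page.goto("http://localhost:3000"); // assumed dev URL
      // Tab through the first ten interactive elements and confirm something holds focus.
      for (let i = 0; i < 10; i++) {
        await page.keyboard.press("Tab");
        const focused = await page.evaluate(() => document.activeElement?.tagName);
        expect(focused).toBeTruthy();
      }
      // Automated WCAG 2.1 AA scan; complements rather than replaces the manual pass.
      const results = await new AxeBuilder({ page })
        .withTags(["wcag2a", "wcag2aa", "wcag21a", "wcag21aa"])
        .analyze();
      expect(results.violations).toEqual([]);
    });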

4.5 Performance

4.5.1 Measure Time to First Tokens (TTFT)

Test-PR-SLR1

  • Type: Structural, Dynamic, Automatic
  • Initial State: The inference server and application are operational.
  • Input/Condition: Submit requests for suggestions and planning.
  • Output/Result: TTFT should meet a 95th percentile threshold below 500ms.
  • How test will be performed: Performance testing tools will automatically simulate user requests for suggestions and planning features, recording the time from request submission to receipt of the first token. Tests will be conducted under various network conditions, including different latencies, to assess performance across typical user scenarios. The results will be compiled into a report detailing the average TTFT, the 95th percentile, and any deviations, to confirm that the application's responsiveness meets the specified requirements.
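
A k6 script along the following lines could drive this test. k6's built-in `http_req_waiting` metric (time to first byte) is a practical proxy for TTFT on a streaming endpoint; the endpoint path, model name, and payload are assumptions based on the OpenAI-compatible API:

    // ttft-load-test.ts — k6 sketch for Test-PR-SLR1 (recent k6 releases run TypeScript directly)
    import http from "k6/http";
    import { check } from "k6";

    export const options = {
      vus: 10, // simulated concurrent users
      duration: "1m",
      thresholds: {
        // http_req_waiting is time-to-first-byte, used here as a proxy for TTFT
        http_req_waiting: ["p(95)<500"],
      },
    };

    export default function () {
      const payload = JSON.stringify({
        model: "gemma-2", // assumed model identifier
        prompt: "Outline an essay on environmental sustainability.",
        max_tokens: 64,
        stream: true,
      });
      const res = http.post("http://localhost:3000/v1/completions", payload, {
        headers: { "Content-Type": "application/json" },
      });
      check(res, { "status is 200": (r) => r.status === 200 });
    }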

4.5.2 Evaluate Throughput of Inference Server

Test-PR-SLR2

  • Type: Structural, Dynamic, Automatic
  • Initial State: Inference server is set up with batch processing capabilities.
  • Input/Condition: Send batched requests with the BATCH_SIZE.
  • Output/Result: Achieve approximately SUGGESTION_TOKENS throughput.
  • How test will be performed: Load testing tools will automatically send concurrent batched requests to the inference server. The number of tokens processed per second will be measured over multiple test runs. Server resource utilization including CPU, GPU, and memory will be analyzed to identify any bottlenecks. If the throughput is below the desired level, optimizations will be recommended to enhance performance.
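
The measurement itself could be approximated with a small script that opens BATCH_SIZE concurrent streams and counts streamed frames per second; this is a sketch under the same endpoint assumptions as above, and frame counting only lower-bounds true token throughput:

    // throughput-probe.ts — rough throughput probe for Test-PR-SLR2 (Node 18+)
    async function measureStream(prompt: string): Promise<number> {
      const start = performance.now();
      let frames = 0;
      const res = await fetch("http://localhost:3000/v1/completions", { // assumed endpoint
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ model: "gemma-2", prompt, max_tokens: 256, stream: true }),
      });
      const reader = res.body!.getReader();
      const decoder = new TextDecoder();
      for (;;) {
        const { done, value } = await reader.read();
        if (done) break;
        // Approximate: counts SSE frames per chunk; frames split across chunks may be missed.
        frames += decoder.decode(value, { stream: true }).split("\n\n").filter(Boolean).length;
      }
      return frames / ((performance.now() - start) / 1000);
    }

    const BATCH_SIZE = 4; // matches the batch size in the throughput target
    Promise.all(
      Array.from({ length: BATCH_SIZE }, () =>
        measureStream("Draft an opening paragraph about climate change.")),
    ).then((rates) => console.log("aggregate ≈", rates.reduce((a, b) => a + b, 0).toFixed(1), "frames/sec"));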

4.5.3 Ensure Interface Contains Only Safe Content

Test-PR-SCR2

  • Type: Structural, Static, Automatic
  • Initial State: All UI elements and assets are integrated.
  • Input/Condition: Review all images, icons, and media used in the application.
  • Output/Result: Confirmation that there is no NSFW or harmful content.
  • How test will be performed: Automated image analysis tools will be used to scan all graphical assets for inappropriate content. Licenses and sources of third-party assets will be verified automatically where possible. Any content detected as unsuitable will be reviewed manually for confirmation and then replaced or removed to maintain a safe user environment.

4.5.4 Test Accuracy of Generated Text Matching User Steering

Test-PR-PAR1

  • Type: Structural, Dynamic, Automatic
  • Initial State: SAEs and steering functionalities are implemented.
  • Input/Condition: Provide steering inputs (e.g., tone, style) and generate text.
  • Output/Result: Generated text aligns with user inputs and feedback.
  • How test will be performed: Specific steering parameters will be defined, and automated scripts will generate text outputs based on these inputs. Analytical metrics like cosine similarity and stylistic analysis tools will be used to quantitatively assess the alignment between the generated text and the steering inputs. Results will be compiled to evaluate the system’s responsiveness and adjustments will be made to improve accuracy where necessary.
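
For the cosine-similarity portion of the analysis, a straightforward implementation could look like this; how the embedding vectors are obtained (SAE activations, a sentence-embedding model, etc.) is left as an assumption:

    // steering-similarity.ts — cosine similarity between two embedding vectors
    function cosineSimilarity(a: number[], b: number[]): number {
      if (a.length !== b.length) throw new Error("dimension mismatch");
      let dot = 0, normA = 0, normB = 0;
      for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
      }
      return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // A generation passes if its embedding stays close to the style reference.
    // The 0.8 threshold is an assumption to be calibrated against human judgments.
    const matchesSteering = (gen: number[], ref: number[]) => cosineSimilarity(gen, ref) >= 0.8;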

4.5.5 Test Deployment Strategy for Fault Tolerance

Test-PR-RFR2

  • Type: Structural, Dynamic, Automatic
  • Initial State: Application is deployed on a Kubernetes cluster.
  • Input/Condition: Simulate node or replica failures.
  • Output/Result: Deployment is recreated automatically, maintaining availability.
  • How test will be performed: Automated scripts will intentionally fail pods or nodes within the Kubernetes cluster. Monitoring tools will automatically track the system’s response and recovery time, verifying that deployments are recreated in accordance with the fault tolerance strategy. Application availability will be checked continuously to ensure minimal impact on users.

4.5.6 Assess Asynchronous Processing of Suggestions

Test-PR-CR1

  • Type: Structural, Dynamic, Automatic
  • Initial State: System supports asynchronous suggestion processing.
  • Input/Condition: Multiple users submit suggestion requests simultaneously.
  • Output/Result: Requests are processed without significant delay or errors.
  • How test will be performed: Automated performance testing tools will simulate multiple users submitting requests concurrently. The system’s queue management and processing times will be monitored automatically to assess its ability to handle asynchronous processing. Any request drops or errors will be logged for analysis and remediation.

4.5.7 Verify Input Responsiveness

Test-PR-CR2

  • Type: Structural, Dynamic, Automatic
  • Initial State: Text manipulation features are implemented.
  • Input/Condition: Perform rapid text entry and editing operations.
  • Output/Result: No noticeable delays or lag in input response.
  • How test will be performed: Automated scripts will perform rapid text entry and editing operations while performance profiling tools measure input latency. Tests will be run on various hardware configurations and browsers automatically. If latency exceeds acceptable thresholds, code optimizations will be implemented to enhance responsiveness.

4.6 Security

4.6.1 Ensure HTTPS Encryption for All Communications

Test-SR-INT1

  • Type: Structural, Dynamic, Automatic
  • Initial State: Application and servers are set up with SSL certificates.
  • Input/Condition: Monitor network traffic during application use.
  • Output/Result: All data transmissions are encrypted using HTTPS.
  • How test will be performed: Automated security testing tools will monitor network traffic to verify that all communications use HTTPS. Attempts to access the application via unsecured HTTP will be scripted to ensure automatic redirection to HTTPS. The validity and configuration of SSL certificates will be checked automatically. Any mixed content warnings detected by browsers will be addressed promptly.
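
One of the scripted redirection checks could be expressed as follows, using fetch with manual redirect handling inside the existing Playwright test runner; the hostname is a placeholder:

    // https-redirect.test.ts — sketch verifying plain-HTTP requests redirect to HTTPS
    import { test, expect } from "@playwright/test";

    test("HTTP requests redirect to HTTPS", async () => {
      // redirect: "manual" surfaces the redirect response instead of following it
      const res = await fetch("http://morph.example.com/", { redirect: "manual" }); // placeholder host
      expect([301, 307, 308]).toContain(res.status);
      expect(res.headers.get("location")).toMatch(/^https:\/\//);
    });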

4.6.2 Implement DNS Security Measures

Test-SR-INT2

  • Type: Structural, Dynamic, Automatic
  • Initial State: DNS security configurations are in place.
  • Input/Condition: Perform DNS queries and observe responses.
  • Output/Result: DNS queries and responses are secure from tampering and spoofing.
  • How test will be performed: Automated DNSSEC testing tools will verify the implementation of DNS security measures. Simulated DNS spoofing attacks will be conducted automatically to test system resilience. Any vulnerabilities detected will be logged and remediated to protect against DNS-based attacks.

4.6.3 Validate Content Security Policies (CSP)

Test-SR-INT3

  • Type: Structural, Dynamic, Automatic
  • Initial State: CSP headers are configured on the server.
  • Input/Condition: Use the application while attempting to execute unauthorized scripts.
  • Output/Result: CSP effectively prevents XSS attacks.
  • How test will be performed: Automated security testing tools will attempt to inject malicious scripts into the application. The effectiveness of CSP in blocking these scripts will be verified automatically. CSP headers will be analyzed to ensure they are correctly configured. Any violations or weaknesses will be addressed to enhance security.
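
A complementary automated assertion could confirm that the CSP header is present and restricts script sources; the directive values below are illustrative, not the deployed policy:

    // csp-header.test.ts — sketch asserting the CSP header exists and restricts script sources
    import { test, expect } from "@playwright/test";

    test("CSP header is configured", async () => {
      const res = await fetch("https://morph.example.com/"); // placeholder host
      const csp = res.headers.get("content-security-policy");
      expect(csp).not.toBeNull();
      // Illustrative directives; the real assertions should mirror the deployed policy.
      expect(csp ?? "").toContain("default-src 'self'");
      expect(csp ?? "").toContain("script-src");
    });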

4.6.4 Verify Privacy Compliance

Test-SR-P1

  • Type: Structural, Static, Automatic
  • Initial State: Application codebase is complete.
  • Input/Condition: Review data handling processes and storage mechanisms.
  • Output/Result: Confirmation that no personal information is collected or stored.
  • How test will be performed: Automated code analysis tools will scan the codebase to identify any components that collect, process, or store personal data. Network traffic will be monitored during simulated user interactions to ensure no personal information is transmitted. Storage mechanisms like databases, local storage, and cookies will be inspected to verify they do not retain personal data. All findings will be documented, and any issues will be resolved to ensure compliance with privacy policies.

4.7 Maintainability and Support

4.7.1 Schedule and Verify Security Updates

Test-OER-MR1

  • Type: Structural, Static, Automatic
  • Initial State: Maintenance schedules are established.
  • Input/Condition: Review update logs and schedules while performing security scans.
  • Output/Result: Regular updates are performed. No outstanding vulnerabilities.
  • How test will be performed: Security updates will be verified by running npm audit at scheduled intervals to detect and address vulnerabilities in dependencies. Audit reports will be reviewed, and necessary updates will be applied to resolve critical issues. Continuous integration pipelines will integrate npm audit checks to prevent new vulnerabilities. All security updates and patches will be documented.

4.8 Compliance

4.9 Traceability Between Test Cases and Requirements

| Requirement ID | Requirement Description | Test Case ID(s) |
| --- | --- | --- |
| FR1 | morph shall provide suggestions during the planning phase of creative writing, offering assistance in structuring ideas, topics, or themes based on user input. | Test-FR-P1, Test-FR-P2, Test-FR-P3 |
| FR2 | morph shall provide users with manual control over text generation, enabling them to select advanced steering options such as tone, style, or creativity level. | Test-FR-S1, Test-FR-S2 |
| FR3 | Users shall be able to interact with generated text through a left-to-right (LTR) feedback panel, allowing them to provide real-time feedback on model-generated suggestions. | Test-FR-F1 |
| FR4 | Users shall be able to set preferences for tone, style, voice, and formality, which morph will apply to all future generations of text. | Test-FR-TG1 |
| FR13 | morph shall allow users to export their documents in .pdf, .md (Markdown), and plain text formats, ensuring compatibility with external platforms. | Test-FR-E1, Test-FR-E2 |
| FR14 | morph shall allow users to customize the visual appearance of the editor by choosing from different themes, such as dark mode, light mode, and high-contrast options. | Test-FR-VT1 |
| LF-A1 | morph shall adopt a unified, non-intrusive, and uncluttered visual design. | Test-LF-A1 |
| LF-A2 | morph must implement a consistent design system across all user interfaces, involving standardized typography and a color palette. | Test-LF-A2 |
| LF-S1 | The design of the application will be minimalist, utilizing clean lines and a monotonic color palette. | Test-LF-S1 |
| LF-S2 | The application must be responsive, adapting seamlessly to various device screens and orientations. | Test-LF-S2 |
| LF-S3 | Interactive elements such as buttons and links must contrast significantly with the background to ensure visibility and accessibility. | Test-LF-S3 |
| LF-S4 | The user interface should enable smooth transitions and intuitive animations across various sections and features. | Test-LF-S4 |
| LF-S5 | The application should include visual cues and feedback for user interactions to reinforce usability. | Test-LF-S5 |
| UH-EOU3 | The application shall include a planning interface to assist users in organizing and debugging their creative writing steps. | Test-UH-EOU3 |
| UH-L1 | New users should be able to understand basic functionalities and start creating or editing content within 10 minutes of initial use. | Test-UH-L1 |
| UH-UP1 | The application should utilize clear and concise language for all instructions, feedback, and user interface elements. | Test-UH-UP1, Test-OER-PR2 |
| UH-A1 | morph should support text resizing without loss of content or functionality. | Test-UH-A1 |
| UH-A2 | morph should ensure that all user interface components are accessible via keyboard navigation. | Test-UH-A2 |
| UH-A3 | Implement ARIA (Accessible Rich Internet Applications) attributes throughout the application. | Test-UH-A3 |
| PR-SLR1 | TTFT should be minimal, around 200-500ms | Test-PR-SLR1 |
| PR-SLR2 | Throughput should be approximately 300 tokens/sec for a batch size of 4 | Test-PR-SLR2 |
| PR-SCR1 | Suggestions must not be harmful | Test-PR-SCR1 |
| PR-SCR2 | The interface must not contain harmful images or NSFW content. | Test-PR-SCR2 |
| PR-PAR1 | The generated text should match users’ steering direction | Test-PR-PAR1, Test-CulR-CR2 |
| PR-RFR1 | A notification toast must be sent to users in case in-flight requests fail to complete. | Test-PR-RFR1 |
| PR-RFR2 | morph must implement a recreate deployment strategy | Test-PR-RFR2 |
| PR-CR1 | Suggestions should run asynchronously on request. | Test-PR-CR1 |
| PR-CR2 | Input should not show any noticeable delay | Test-PR-CR2, Test-OER-EPE2 |
| PR-SER1 | morph inference server must include scale-to-zero and concurrency-based autoscaling. | Test-PR-SER1, Test-OER-EPE2 |
| PR-LR1 | Future integration with other language model architectures | Test-PR-LR1, Test-OER-RIAS1 |
| OER-EPE1 | morph will be able to run on different hardware environments, given they can run a modern browser. | Test-OER-AR1 |
| OER-EPE2 | morph should have a minimal increase in power consumption | Test-PR-SER1, Test-PR-CR2 |
| OER-RIAS1 | morph inference server should provide OpenAI-compatible endpoints. | Test-PR-LR1 |
| OER-PR1 | Secrets must be configured with appropriate Role-Based Access Control (RBAC) rules | Test-SR-INT4 |
| OER-PR2 | Relevant documentation should be accessible by users. | Test-UH-UP1, Test-UH-L1 |
| OER-PR3 | Feedback should also be included within the interface | Test-OER-SR1 |
| OER-RR1 | Release cycle must utilize the current GitHub CD workflow. | Test-OER-MR2 |
| OER-RR2 | End-to-end tests should pass before deploying to production. | Test-OER-MR2 |
| OER-MR1 | Security updates must be done periodically | Test-OER-MR1, Test-SR-IM1 |
| OER-MR2 | Feature integrations must pass existing tests | Test-OER-MR2 |
| OER-AR1 | morph must be able to run within existing users’ environments | Test-PR-LR2 |
| SR-INT1 | All communication between the client UI, backend services, and external APIs must be encrypted using HTTPS. | Test-SR-INT1 |
| SR-INT2 | Implement DNS security measures to ensure that DNS queries and responses are protected against tampering and spoofing. | Test-SR-INT2 |
| SR-INT3 | The application will use content security policies to mitigate the risk of XSS attacks. | Test-SR-INT3 |
| SR-P1 | The application must ensure that it does not collect or store personal information, adhering strictly to privacy-by-design principles. | Test-SR-P1 |
| SR-AU1 | Implement monitoring of interactions with external service providers to ensure their use complies with security policies and performance expectations. | Test-OER-MR1 |
| SR-IM1 | Employ up-to-date security measures to protect against known threats and vulnerabilities, including regular updates and patches to the software components. | Test-OER-MR1 |
| SR-IM2 | Configure the application to minimize the surface area for attacks by disabling unused services and endpoints. | Test-SR-INT4 |
| CulR-CR1 | English support | Test-UH-PI1 |
| CulR-CR3 | Support left-to-right (LTR) reading flow | Test-UH-PI1 |

5 Unit Test Description

This section outlines unit tests for verifying the core modules and components of morph. The overall philosophy for test case selection is based on achieving maximum coverage of critical functionalities while prioritizing areas of high complexity or risk.

5.1 Unit Testing Scope

The unit testing scope focuses on core components of morph while excluding certain modules due to priority, ownership, or reliability considerations.

5.1.1 In-Scope Modules

  • Text editor core components including buffer management and file operations
  • Planning interface components for user interactions
  • Feature steering modules for SAE integration and activation caching
  • Version control subsystem for tracking document changes
  • Inference server request handling and response processing
  • User preference and configuration management
  • Security components including authentication and data protection

5.1.2 Out-of-Scope Modules

  • Third-party dependencies:
    • Base language models (Gemma 2, Llama 3.1) as they are externally maintained
  • Browser-specific implementations
  • Operating system specific functionalities
  • Cloud provider specific implementations

5.1.3 Testing Priority Rationale

  1. High Priority (P0)
  • Core text editing functionality
  • Feature steering components
  • Planning interface elements
  • Security-critical components
  2. Medium Priority (P1)
  • Version control features
  • Configuration management
  • Performance optimization components
  3. Low Priority (P2)
  • Extended platform support
  • Optional features
  • Cosmetic components

5.2 Tests for Functional Requirements

Omitted for now; refer to the Module Guide for more information.

5.3 Tests for Nonfunctional Requirements

5.3.1 Inference Engine

Test-IE1

  • Type: Functional, Dynamic, Automatic
  • Initial State: The Inference Engine module is fully implemented and operational.
  • Input/Condition: Submit a predefined request to the inference engine.
  • Output/Result: The Time-to-First-Token (TTFT) is within 200-500 milliseconds.
  • How test will be performed: A unit test will be created to send a request to the inference engine with a specific input. The test will measure the time elapsed from the moment the request is sent to the receipt of the first token from the inference engine. The test will assert that the TTFT falls within the specified range of 200-500 milliseconds.
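
The following is a minimal pytest sketch of this check, assuming an OpenAI-compatible streaming endpoint and the httpx client library; the URL, model id, and payload shape are placeholders, not the actual asteraceae configuration.

```python
# Sketch of Test-IE1 (assumptions: endpoint URL, model id, payload shape).
import time

import httpx  # assumed HTTP client for the test harness

INFERENCE_URL = "http://localhost:8000/v1/completions"  # hypothetical endpoint
TTFT_MIN_S, TTFT_MAX_S = 0.2, 0.5  # 200-500 ms range from PR-SLR1


def test_ttft_within_range():
    payload = {
        "model": "gemma-2",  # placeholder model id
        "prompt": "Write an opening line about the sea.",
        "max_tokens": 32,
        "stream": True,  # stream so the first chunk approximates the first token
    }
    start = time.perf_counter()
    with httpx.stream("POST", INFERENCE_URL, json=payload, timeout=10.0) as resp:
        resp.raise_for_status()
        first_chunk = next(resp.iter_lines())  # time to the first streamed chunk
        ttft = time.perf_counter() - start
    assert first_chunk, "no data received from the stream"
    assert TTFT_MIN_S <= ttft <= TTFT_MAX_S, f"TTFT {ttft*1000:.0f} ms out of range"
```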

Test-IE2

  • Type: Functional, Dynamic, Automatic
  • Initial State: The Inference Engine module supports batch processing.
  • Input/Condition: Send batched requests with a batch size of 4.
  • Output/Result: Achieve a throughput of approximately 300 tokens per second.
  • How test will be performed: The unit test will send multiple batched requests to the inference engine, each batch containing 4 requests. It will measure the total number of tokens processed per second during the test. The test will verify that the throughput meets the specified target by asserting that the tokens processed per second are approximately 300.
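
A sketch of the throughput measurement follows, assuming the same hypothetical endpoint and that responses report token counts in an OpenAI-style `usage` field; the ±20% tolerance band for "approximately 300 tokens/sec" is an assumption.

```python
# Sketch of Test-IE2 (assumptions: endpoint URL, model id, usage field, tolerance).
import asyncio
import time

import httpx

INFERENCE_URL = "http://localhost:8000/v1/completions"  # hypothetical endpoint
BATCH_SIZE = 4                # per PR-SLR2 and BATCH_SIZE in section 6.4
TARGET_TOKENS_PER_SEC = 300   # per SUGGESTION_TOKENS in section 6.4


async def _one_request(client: httpx.AsyncClient) -> int:
    resp = await client.post(INFERENCE_URL, json={
        "model": "gemma-2",  # placeholder model id
        "prompt": "Summarize the benefits of outlining before drafting.",
        "max_tokens": 256,
    }, timeout=60.0)
    resp.raise_for_status()
    # OpenAI-compatible responses report completion token counts under `usage`.
    return resp.json()["usage"]["completion_tokens"]


def test_batched_throughput():
    async def run_batch() -> tuple[int, float]:
        async with httpx.AsyncClient() as client:
            start = time.perf_counter()
            counts = await asyncio.gather(
                *[_one_request(client) for _ in range(BATCH_SIZE)])
            return sum(counts), time.perf_counter() - start

    total_tokens, elapsed = asyncio.run(run_batch())
    throughput = total_tokens / elapsed
    # "Approximately 300 tokens/sec": allow a ±20% band (assumption).
    assert throughput >= TARGET_TOKENS_PER_SEC * 0.8, f"{throughput:.0f} tok/s too low"
```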

Test-IE3

  • Type: Functional, Dynamic, Automatic
  • Initial State: The Harmful Content Filter within the Inference Engine is implemented.
  • Input/Condition: Provide inputs that are likely to generate harmful or inappropriate content.
  • Output/Result: The inference engine outputs are free of harmful or inappropriate content, as judged by keyword and phrase blacklists together with regular expressions.
  • How test will be performed: The unit test will feed the inference engine with inputs known to potentially trigger harmful content. It will analyze the outputs to ensure that no harmful or inappropriate content is present, confirming that the engine’s safety mechanisms effectively filter out undesirable content.
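
A minimal sketch of the blacklist check is shown below; the blacklist entries and the `generate` helper are placeholders standing in for the engine's real filter and API client.

```python
# Sketch of Test-IE3 (assumptions: blacklist entries and `generate` helper
# are placeholders; the real filter lives in the inference engine).
import re

BLACKLIST_KEYWORDS = {"examplebadword"}  # placeholder keyword blacklist
BLACKLIST_PATTERNS = [re.compile(r"(?i)step[- ]by[- ]step weapon")]  # placeholder


def contains_harmful_content(text: str) -> bool:
    """Flag text that matches the keyword or regex blacklists."""
    lowered = text.lower()
    if any(keyword in lowered for keyword in BLACKLIST_KEYWORDS):
        return True
    return any(pattern.search(text) for pattern in BLACKLIST_PATTERNS)


def generate(prompt: str) -> str:
    """Placeholder for a real call to the inference engine's API."""
    return "A restrained, unobjectionable description."


def test_outputs_pass_harmful_content_filter():
    risky_prompts = ["Describe a violent scene in graphic detail."]
    for prompt in risky_prompts:
        assert not contains_harmful_content(generate(prompt))
```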

5.3.2 User Interface

Test-UI1

  • Type: Functional, Dynamic, Automatic
  • Initial State: The UI module is fully implemented and operational.
  • Input/Condition: Simulate rapid text input and editing actions by the user.
  • Output/Result: The UI responds without noticeable delays, ensuring a smooth user experience.
  • How test will be performed: The unit test will automate rapid typing and editing actions within the text editor component of the UI. It will measure the response time of the UI to these actions, asserting that the response time is within acceptable thresholds of less than 50 milliseconds per action, ensuring high responsiveness during intensive user interactions.
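
One way to automate this is sketched below with Playwright's Python API; the app URL and editor selector are assumptions, and per-keystroke round-trip time is used as a rough proxy for UI responsiveness.

```python
# Sketch of Test-UI1 (assumptions: app URL, editor test id, and that
# keystroke round-trip time approximates UI responsiveness).
import time

from playwright.sync_api import sync_playwright

APP_URL = "http://localhost:3000"          # hypothetical dev server
EDITOR_SELECTOR = "[data-testid=editor]"   # hypothetical selector
PER_ACTION_BUDGET_MS = 50


def test_rapid_typing_stays_responsive():
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(APP_URL)
        editor = page.locator(EDITOR_SELECTOR)
        editor.click()
        for char in "rapidtypingsimulation":
            start = time.perf_counter()
            editor.press(char)  # dispatch one keystroke to the editor
            elapsed_ms = (time.perf_counter() - start) * 1000
            assert elapsed_ms < PER_ACTION_BUDGET_MS, (
                f"keystroke took {elapsed_ms:.1f} ms")
        browser.close()
```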

Test-UI2

  • Type: Functional, Dynamic, Automatic
  • Initial State: The notification system within the UI is implemented.
  • Input/Condition: Simulate an inflight request failure in the application.
  • Output/Result: A notification toast is displayed to the user informing about the request failure.
  • How test will be performed: The unit test will mock a failure in an inflight request by triggering an error condition in the request handling module. It will then verify that the UI displays a notification toast with a clear message informing the user of the failure and suggesting a retry or troubleshooting action.
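
The failure can be mocked at the network layer, as in the sketch below; the request URL pattern, the "Suggest" control, and the assumption that toasts carry ARIA role "status" are all hypothetical.

```python
# Sketch of Test-UI2 (assumptions: app URL, request URL pattern, "Suggest"
# control, and that the toast is exposed with ARIA role "status").
from playwright.sync_api import sync_playwright

APP_URL = "http://localhost:3000"     # hypothetical dev server
SUGGESTION_API = "**/v1/completions"  # hypothetical request pattern


def test_toast_shown_when_inflight_request_fails():
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        # Abort every suggestion request to mock an in-flight failure.
        page.route(SUGGESTION_API, lambda route: route.abort())
        page.goto(APP_URL)
        page.get_by_role("button", name="Suggest").click()  # hypothetical control
        toast = page.get_by_role("status")
        toast.wait_for(state="visible", timeout=5000)
        assert "fail" in toast.inner_text().lower()
        browser.close()
```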

5.3.3 Deployment Management

Test-DM1

  • Type: Functional, Dynamic, Automatic
  • Initial State: The deployment strategy is configured with recreate capabilities.
  • Input/Condition: Simulate a failure of a node or replica in the deployment environment.
  • Output/Result: The deployment is automatically recreated, maintaining application availability.
  • How test will be performed: The unit test will simulate the failure of a node or pod within the Kubernetes cluster by programmatically deleting or stopping it. It will verify that the deployment controller automatically recreates the failed components and that the application remains available during the process, ensuring robustness and fault tolerance in the deployment strategy.
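
A sketch using the official `kubernetes` Python client follows; the namespace, label selector, and timing budget are assumptions about the cluster, not the actual deployment configuration.

```python
# Sketch of Test-DM1 (assumptions: namespace, pod label, timing budget).
import time

from kubernetes import client, config

NAMESPACE = "morph"                # hypothetical namespace
LABEL_SELECTOR = "app=asteraceae"  # hypothetical pod label


def test_deployment_recreates_failed_pod():
    config.load_kube_config()  # assumes local kubeconfig access to the cluster
    v1 = client.CoreV1Api()
    pods = v1.list_namespaced_pod(NAMESPACE, label_selector=LABEL_SELECTOR).items
    assert pods, "expected at least one running pod"
    victim = pods[0].metadata.name
    v1.delete_namespaced_pod(victim, NAMESPACE)  # simulate a pod failure
    # Poll until the controller brings a replacement pod to Running state.
    deadline = time.time() + 120
    while time.time() < deadline:
        current = v1.list_namespaced_pod(
            NAMESPACE, label_selector=LABEL_SELECTOR).items
        running = [p for p in current
                   if p.metadata.name != victim and p.status.phase == "Running"]
        if running:
            return
        time.sleep(2)
    raise AssertionError("no replacement pod reached Running state in time")
```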

5.3.4 Suggestion Processing

Test-SP1

  • Type: Functional, Dynamic, Automatic
  • Initial State: The suggestion processing module supports asynchronous operations.
  • Input/Condition: Submit multiple suggestion requests simultaneously.
  • Output/Result: All suggestions are processed correctly without errors or significant delays.
  • How test will be performed: The unit test will concurrently submit multiple suggestion requests to the processing module using asynchronous calls. It will monitor the processing of each request to ensure they are handled independently and efficiently. The test will assert that all suggestions are returned correctly and within acceptable time frames, confirming the system’s capacity to handle concurrent requests.
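
A minimal concurrency sketch is shown below, assuming the same hypothetical endpoint and an OpenAI-style response shape.

```python
# Sketch of Test-SP1 (assumptions: endpoint URL, model id, response shape).
import asyncio

import httpx

SUGGESTION_URL = "http://localhost:8000/v1/completions"  # hypothetical endpoint


async def _suggest(client: httpx.AsyncClient, prompt: str) -> str:
    resp = await client.post(SUGGESTION_URL, json={
        "model": "gemma-2",  # placeholder model id
        "prompt": prompt,
        "max_tokens": 64,
    }, timeout=30.0)
    resp.raise_for_status()
    return resp.json()["choices"][0]["text"]


def test_concurrent_suggestions_complete():
    prompts = [f"Suggest a next sentence for draft {i}." for i in range(8)]

    async def run() -> list[str]:
        async with httpx.AsyncClient() as client:
            # Fire all requests concurrently; each should complete independently.
            return await asyncio.gather(*[_suggest(client, p) for p in prompts])

    results = asyncio.run(run())
    assert len(results) == len(prompts)
    assert all(isinstance(r, str) and r for r in results)
```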

5.3.5 Autoscaling Mechanism

Test-AM1

  • Type: Functional, Dynamic, Automatic
  • Initial State: The autoscaling configuration is implemented in the deployment environment.
  • Input/Condition: Vary the load on the inference server to simulate high and low traffic conditions.
  • Output/Result: The inference server scales up during high traffic and scales down to zero during low traffic.
  • How test will be performed: The unit test will programmatically generate varying loads on the inference server by simulating user requests at different rates. It will monitor the number of active server instances and verify that the server scales up when request rates exceed a defined threshold, and scales down to zero after a period of inactivity (e.g., 5 minutes without traffic), confirming that autoscaling optimizes resource usage as configured.
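
The sketch below illustrates one shape this test could take, assuming a Kubernetes deployment reachable via the `kubernetes` Python client; the deployment name, idle window, and crude sequential load loop (a real run might use a load generator such as k6) are all assumptions.

```python
# Sketch of Test-AM1 (assumptions: deployment/namespace names, idle window,
# and a simplified load burst in place of a real load generator like k6).
import time

import httpx
from kubernetes import client, config

NAMESPACE = "morph"        # hypothetical namespace
DEPLOYMENT = "asteraceae"  # hypothetical deployment name
INFERENCE_URL = "http://localhost:8000/v1/completions"  # hypothetical endpoint


def replica_count() -> int:
    apps = client.AppsV1Api()
    status = apps.read_namespaced_deployment(DEPLOYMENT, NAMESPACE).status
    return status.ready_replicas or 0


def test_scale_up_then_scale_to_zero():
    config.load_kube_config()
    # A burst of traffic should trigger a scale-up.
    for _ in range(50):
        httpx.post(INFERENCE_URL, json={"model": "gemma-2", "prompt": "hi",
                                        "max_tokens": 8}, timeout=60.0)
    assert replica_count() >= 1
    # After the assumed 5-minute idle window, replicas should reach zero.
    time.sleep(6 * 60)
    assert replica_count() == 0
```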

5.3.6 Model Integration

Test-MI1

  • Type: Functional, Dynamic, Automatic
  • Initial State: The application is prepared to support multiple language model architectures.
  • Input/Condition: Integrate an alternative language model architecture into the system.
  • Output/Result: The application functions correctly with the new language model without errors.
  • How test will be performed: The unit test will replace the current language model with an alternative one, such as integrating a new SAE model. It will run the existing unit tests and check for compatibility issues or errors, ensuring that the application remains stable and functional with the new model.

5.3.7 Platform Compatibility

Test-PC1

  • Type: Functional, Dynamic, Automatic
  • Initial State: The application build is configured for different operating systems.
  • Input/Condition: Build and run the application on Windows, macOS, and Linux environments.
  • Output/Result: The application installs and runs successfully without error on all supported platforms.
  • How test will be performed: The unit test will automate the build process for the application on different operating systems using cross-platform build tools. It will then execute automated tests to ensure that the application functions correctly on each platform, ensuring adaptability and support for different distribution platforms.

5.3.8 Security Components

Test-SC1

  • Type: Functional, Static, Automatic
  • Initial State: Security components and dependencies are up-to-date.
  • Input/Condition: Check for known vulnerabilities in the current dependencies.
  • Output/Result: No critical vulnerabilities are found; security updates are scheduled and performed.
  • How test will be performed: The unit test will use tools like npm audit or Snyk to scan the project’s dependencies for known security vulnerabilities. It will generate a report of any issues found and verify that updates are applied promptly, maintaining the security integrity of the application.
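
A sketch of the dependency scan follows; it shells out to `npm audit --json` and reads the severity summary that npm 7+ reports under `metadata.vulnerabilities`.

```python
# Sketch of Test-SC1: fail the build when critical advisories are reported.
import json
import subprocess


def test_no_critical_vulnerabilities():
    # `npm audit --json` emits a machine-readable report; it exits nonzero
    # when issues exist, so we parse stdout rather than check the return code.
    result = subprocess.run(
        ["npm", "audit", "--json"],
        capture_output=True, text=True,
    )
    report = json.loads(result.stdout)
    severities = report.get("metadata", {}).get("vulnerabilities", {})
    critical = severities.get("critical", 0)
    assert critical == 0, f"{critical} critical vulnerabilities found"
```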

Test-SC2

  • Type: Functional, Static, Automatic
  • Initial State: RBAC policies are defined and implemented.
  • Input/Condition: Attempt to access secrets and resources with different user roles.
  • Output/Result: Access is granted only to roles explicitly authorized in the RBAC policy; unauthorized roles are denied access and receive a clear error indicating insufficient permissions.
  • How test will be performed: The unit test will simulate users with various roles attempting to access sensitive resources like secrets or configuration files. It will verify that only authorized roles have access, and unauthorized attempts are blocked, ensuring that secrets are protected with proper access controls.
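
One way to probe the RBAC rules is via `kubectl auth can-i`, as sketched below; the namespace and service-account names are assumptions about the deployment.

```python
# Sketch of Test-SC2 (assumptions: namespace and service-account names).
import subprocess

NAMESPACE = "morph"  # hypothetical namespace


def can_read_secrets(service_account: str) -> bool:
    # `kubectl auth can-i` prints "yes" or "no" for the impersonated identity.
    result = subprocess.run(
        ["kubectl", "auth", "can-i", "get", "secrets",
         "--as", f"system:serviceaccount:{NAMESPACE}:{service_account}",
         "-n", NAMESPACE],
        capture_output=True, text=True,
    )
    return result.stdout.strip() == "yes"


def test_secret_access_follows_rbac_policy():
    assert can_read_secrets("inference-server")  # authorized role (assumed)
    assert not can_read_secrets("frontend")      # unauthorized role (assumed)
```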

5.3.9 Documentation Accessibility

Test-DA1

  • Type: Functional, Static, Manual
  • Initial State: Documentation is written and linked within the application.
  • Input/Condition: Navigate through the application to locate and access the documentation.
  • Output/Result: Users can easily find and access relevant documentation from the interface.
  • How test will be performed: A manual test will be conducted where the tester navigates the application’s UI to locate links to usage manuals and technical documentation. The tester will verify that the links are prominently placed, clearly labeled, and lead to the correct documentation pages. This will facilitate user understanding through accessible documentation.

5.3.10 Release and Deployment

Test-RD1

  • Type: Functional, Dynamic, Automatic
  • Initial State: The GitHub CD workflow is set up for the project.
  • Input/Condition: Commit new code changes and push to the repository.
  • Output/Result: The continuous deployment pipeline is triggered, and the application is built and deployed automatically without error.
  • How test will be performed: The unit test will simulate code changes by committing to a test branch. It will verify that the GitHub CD workflow is triggered, running automated tests and deploying the application if all tests pass. This will ensure that the release cycle utilizes the current CI/CD workflow effectively.
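
The pipeline check could be scripted against the GitHub CLI, as in this sketch; the workflow file name and polling budget are assumptions, and the test presumes a commit was just pushed to the test branch.

```python
# Sketch of Test-RD1 (assumptions: workflow file name, polling budget, and
# that a commit has just been pushed to the test branch).
import json
import subprocess
import time


def latest_run(workflow: str) -> dict:
    # `gh run list --json` returns the most recent runs as JSON.
    out = subprocess.run(
        ["gh", "run", "list", "--workflow", workflow,
         "--limit", "1", "--json", "status,conclusion"],
        capture_output=True, text=True, check=True,
    ).stdout
    runs = json.loads(out)
    return runs[0] if runs else {}


def test_cd_pipeline_deploys_on_push():
    deadline = time.time() + 15 * 60
    while time.time() < deadline:
        run = latest_run("deploy.yml")  # hypothetical workflow file
        if run.get("status") == "completed":
            assert run["conclusion"] == "success"
            return
        time.sleep(30)
    raise AssertionError("deployment workflow did not complete in time")
```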

5.4 Traceability Between Test Cases and Modules

Test Case ID | Description | Module
Test-FR-P1 | Verify planning suggestions appear within 10 seconds with valid prompt input. | Planning Suggestions
Test-FR-P2 | Test error message response to unintelligible prompt input in planning mode. | Planning Suggestions
Test-FR-P3 | Validate system’s handling of large prompt inputs, providing condensed suggestions or a prompt-length warning. | Planning Suggestions
Test-FR-S1 | Verify text suggestions in the style of a specific author when tone steering is applied. | Steering Text Generation
Test-FR-S2 | Test model adaptation to user-uploaded writing sample for customized suggestions within 30 seconds. | Steering Text Generation
Test-FR-F1 | Validate real-time feedback update in LTR panel based on user’s tone and phrasing adjustments. | LTR Feedback Panel
Test-FR-TG1 | Ensure future text generations reflect user-set preferences for tone, style, voice, and formality. | User Preferences for Text Generation
Test-FR-UP1 | Verify the saving of a user profile with specific writing style configurations, such as creative writing. | User Profile Configuration
Test-FR-UP2 | Test the creation and retention of multiple user profiles without overwriting. | User Profile Configuration
Test-FR-RM1 | Verify non-linear navigation in the tree view, showing key points and sections in the document. | Revision Management
Test-FR-VC1 | Validate navigation to a previous version in version control with viewing or restoring options. | Version Control
Test-FR-VC2 | Test reversion to a previous version in version control, making it the current document state. | Version Control
Test-FR-PT1 | Test real-time word count tracking with a user-set writing goal. | Writing Goals and Progress Tracking
Test-FR-PT2 | Validate tone consistency monitoring, with suggestions when tone deviates from user’s set goal. | Writing Goals and Progress Tracking
Test-FR-E1 | Verify successful export of a completed document in .pdf format, preserving formatting. | Document Export Functionality
Test-FR-E2 | Test plain text (.txt) export functionality, ensuring only raw content is saved. | Document Export Functionality
Test-FR-VT1 | Validate theme customization, specifically switching from default light mode to dark mode. | Visual Theme Customization
Test-LF-A1 | Verify unified, non-intrusive, and uncluttered visual design. | User Interface
Test-LF-A2 | Verify standardized typography and color palettes are consistently applied. | User Interface
Test-LF-S1 | Validate minimalist design with a monotonic color palette. | User Interface
Test-LF-S2 | Test responsiveness across devices and orientations. | User Interface
Test-LF-S3 | Verify contrast of interactive elements to ensure visibility. | User Interface
Test-LF-S4 | Assess smooth transitions and animations for an intuitive experience. | User Interface
Test-LF-S5 | Verify visual feedback for user interactions. | User Interface
Test-UH-EOU1 | Evaluate session history feature accuracy in recording activities. | User Interface
Test-UH-EOU2 | Test interactive review and manual acceptance of suggestions. | Suggestion Processing
Test-UH-EOU3 | Assess the planning interface for effective user interaction. | User Interface
Test-UH-PI1 | Verify multilingual support functionality. | User Interface
Test-UH-PI2 | Test theme customization options for light and dark modes. | User Interface
Test-UH-L1 | Measure onboarding time for new users to begin creating or editing content. | User Interface
Test-UH-UP1 | Evaluate the clarity of language used in the UI. | User Interface
Test-UH-A1 | Test text resizing functionality for accessibility. | User Interface
Test-UH-A2 | Verify keyboard navigation accessibility for interactive components. | User Interface
Test-UH-A3 | Implement and test ARIA attributes for screen reader compatibility. | User Interface
Test-PR-SLR1 | Measure TTFT (Time-to-First-Token) with a 95th percentile below 500 ms during requests. | Inference Engine
Test-PR-SLR2 | Evaluate throughput of the inference server with batch processing capabilities. | Inference Engine
Test-PR-SCR1 | Validate that suggestions are non-harmful and appropriate. | Inference Engine
Test-PR-SCR2 | Ensure interface content contains no harmful or NSFW elements. | User Interface
Test-PR-PAR1 | Test the accuracy of generated text matching user steering parameters. | Suggestion Processing
Test-PR-RFR1 | Verify notification toast for in-flight request failures. | User Interface
Test-PR-RFR2 | Test deployment strategy to ensure fault tolerance and application availability. | Deployment Management
Test-PR-CR1 | Assess asynchronous processing of multiple user requests without significant delay. | Suggestion Processing
Test-PR-CR2 | Verify input responsiveness during rapid text entry and editing. | User Interface
Test-PR-SER1 | Test autoscaling mechanism of inference server during varying traffic loads. | Autoscaling Mechanism
Test-PR-LR1 | Evaluate integration with different model architectures. | Model Integration
Test-PR-LR2 | Test packaging and execution across different operating systems. | Platform Compatibility
Test-SR-INT1 | Ensure all communications are encrypted using HTTPS. | Security Components
Test-SR-INT2 | Implement DNS security measures to secure queries and responses. | Security Components
Test-SR-INT3 | Validate Content Security Policies (CSP) to prevent XSS attacks. | Security Components
Test-SR-INT4 | Test session security with JWT and short-lived tokens. | Security Components
Test-SR-P1 | Verify that privacy compliance is maintained with no collection of personal data. | Security Components
Test-OER-MR1 | Schedule and verify that security updates are performed regularly. | Security Components
Test-OER-MR2 | Ensure new feature integrations pass existing tests without regression. | Release and Deployment
Test-OER-SR1 | Implement a user feedback loop to ensure user feedback is recorded and accessible. | User Interface
Test-CompR-LR1 | Verify compliance with Canadian copyright laws for generated content. | Documentation Accessibility
Test-CompR-LR2 | Ensure SOC 2 compliance for security standards of the inference server. | Security Components
Test-CompR-LR3 | Obtain user consent before using content for inference purposes. | Security Components
Test-CompR-SCR1 | Verify that client-server communications adhere to HTTP/1.1 standards. | Release and Deployment

6 Appendix

6.1 Usability Survey Questions

  1. On a scale of 1 to 5, how would you rate the overall visual design of morph?
  2. Did you find the interface to be uncluttered and non-intrusive? Please explain.
  3. How easy was it to navigate through the application?
  4. Were the animations and transitions smooth and helpful in guiding you?
  5. Did you notice any issues with theme customization (light/dark mode)?
  6. How intuitive was the planning interface for organizing your writing?
  7. Were instructions and feedback clear and understandable?
  8. Did you encounter any difficulties when resizing text or using keyboard navigation?
  9. How satisfied are you with the responsiveness of the application?
  10. Do you have any suggestions for improving the usability of morph?

6.2 Test Input Table

Test Case ID | Input Description | Input Type | Sample Input | Notes
Test-FR-P1 | Prompt entered into planning interface | Text | environmental sustainability | Used to test suggestion generation
Test-FR-P2 | Unintelligible prompt for robustness check | Text | asdh123!@# | Should trigger input refinement message
Test-FR-P3 | Large block of text for planning | Text | A 500+ word passage on climate change | Sourced externally and uploaded for testing
Test-FR-S1 | Prompt with “Didion” style selection | Text + Style | Prompt: reflection on modern life; Style: Didion | See Checklist 6.3.1 for style validation
Test-FR-S2 | Writing sample upload for personalization | Text file | Excerpt from Joan Didion’s essay, e.g., The White Album (from published book or verified online source) | Uploaded sample used for tone adaptation
Test-FR-F1 | Feedback on generated suggestion | UI Interaction | Tone adjustment, alternate phrasing | Verifies real-time feedback loop
Test-FR-TG1 | User-defined preferences for generation | Settings | Tone: formal, Style: narrative, Passive voice, High formality | Applied across generations
Test-FR-UP1 | Save creative writing profile | Settings | Informal tone, Narrative style | Verifies profile retention
Test-FR-UP2 | Create and save academic writing profile | Settings | Formal tone, Analytical style | Ensures multiple profiles can be saved
Test-FR-RM1 | Navigate document via tree view | UI Interaction | Enable tree view navigation | Visualizes section hierarchy
Test-FR-VC1 | View previous version of document | UI Interaction | Version history menu | Shows restore/diff options
Test-FR-VC2 | Revert to earlier document version | UI Interaction | Select version to revert | Current document updates to selected version
Test-FR-PT1 | Set word count goal for writing | Numeric Input | Goal: 1500 words | Progress meter should reflect count
Test-FR-PT2 | Set tone consistency goal | Style Setting | Formal tone goal | Alerts when tone deviates
Test-FR-E1 | Export document to PDF | UI Interaction | Select “Export as PDF” | Output must retain formatting
Test-FR-E2 | Export document to plain text | UI Interaction | Select “Export as Plain Text” | Output should be unformatted text only
Test-FR-VT1 | Switch from light to dark theme | UI Interaction | Select “Dark Mode” in preferences | Dark mode applies consistently across UI

6.3 Output Evaluation Checklists

6.3.1 Didion’s Writing Style Validation

To mark this test as a pass, at least 4 of the 5 criteria must be met:

  • Uses first-person narrative (e.g., “I think”, “I remember”)
  • Reflective or meditative tone with subtle emotional nuance
  • Unconventional sentence structures or rhythms (e.g., sentence fragments, loose clauses)
  • Descriptive and personal – includes vivid imagery or self-observation
  • Stylistic consistency across 3+ generated paragraphs

6.3.2 Formal Tone Generation

  • Avoids contractions (e.g., “do not” vs. “don’t”)
  • Uses domain-appropriate vocabulary
  • Maintains neutral and objective tone
  • Consistent sentence structure without slang or colloquialisms

6.4 Symbolic Parameters

The following symbolic constants are used throughout this document to improve clarity, maintainability, and flexibility. These constants replace hardcoded values in both functional and nonfunctional test cases.

Variable Name | Value | Description
MAX_DISPLAY_TIME | 10 seconds | Maximum time allowed for planning suggestions to appear
LENGTH_RESPONSE_TIME | 10 seconds | Time limit for response to long input prompts
LARGE_INPUT_SIZE | 500 words | Threshold for input size triggering condensation or warning
STYLE_MATCH_TIME | 30 seconds | Time to generate personalized suggestions from a user sample
WORD_GOAL | 1500 words | Word count goal for writing progress tracking
ONBOARDING_TIME | 10 minutes | Maximum time expected for new users to begin content creation
SUGGESTION_TOKENS | 300 tokens/sec | Desired throughput of token generation by inference server
TTFT_THRESHOLD | 500 ms | 95th percentile threshold for Time To First Token
BATCH_SIZE | 4 | Number of concurrent requests in throughput testing
USABILITY_TEST_PARTICIPANTS | 10 users | Number of participants for usability testing
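
As a minimal illustration, these parameters could live in a single shared test module so every test imports one source of truth; the module name and the units-as-suffix naming convention below are assumptions, not part of the plan.

```python
# constants.py (hypothetical shared test module mirroring section 6.4)
MAX_DISPLAY_TIME_S = 10            # planning suggestions must appear within this
LENGTH_RESPONSE_TIME_S = 10        # response limit for long input prompts
LARGE_INPUT_SIZE_WORDS = 500       # threshold triggering condensation/warning
STYLE_MATCH_TIME_S = 30            # personalized-suggestion generation budget
WORD_GOAL = 1500                   # progress-tracking word count goal
ONBOARDING_TIME_S = 10 * 60        # new users should start writing within 10 min
SUGGESTION_TOKENS_PER_S = 300      # target inference throughput
TTFT_THRESHOLD_MS = 500            # 95th percentile TTFT bound
BATCH_SIZE = 4                     # concurrent requests in throughput testing
USABILITY_TEST_PARTICIPANTS = 10   # usability study sample size
```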

6.5 Reflection

  1. One of the most successful aspects of writing this deliverable was refining and polishing the overall structure of the document. I focused on ensuring that the contents were clearly articulated, logically ordered, and stylistically consistent with our other project documents such as the SRS and Hazard Analysis. This alignment helped maintain a unified tone and made the V&V plan easier to follow. I also contributed to ensuring that terminology and references across different sections were cross-checked and synchronized, which improved clarity and cohesion.
  2. The main challenge I experienced was the sheer volume of writing involved. Due to the document’s length and depth, it was easy to lose track of structural consistency and test coverage balance across sections. It was also difficult to find the right balance between technical specificity and readability for a diverse audience that includes supervisors, peers, and external reviewers. To address this, I used collaborative writing tools and relied on peer feedback sessions to prioritize clarity and completeness. I also coordinated with team members to divide responsibilities and reduce redundancy, which helped streamline the workload and improve section quality.
  3. To complete the V&V process for morph, our team needs skills in dynamic testing to validate responsiveness and functional behavior under real user scenarios. Static analysis is also important for catching performance or security issues early, particularly in the backend. Accessibility testing is essential to ensure the UI meets WCAG standards and is fully keyboard-navigable. Familiarity with CI/CD pipelines is needed to automate testing and enforce quality gates. Finally, proficiency with tools like Jest, Playwright, Postman, axe-core, and Ruff is crucial for implementing tests efficiently across both the frontend and backend.
  4. Aaron will focus on dynamic testing by applying tools like k6 directly to components during development, allowing for real-time feedback and iteration. Waleed will strengthen his static analysis skills through tutorials on tools like Ruff and Bandit, providing structured learning for backend quality. Nebras chose to enroll in a web accessibility course to gain a comprehensive foundation before conducting audits. Lucas is using GitHub documentation and example workflows to build and tune the CI/CD pipeline directly in our repo. Across the team, we’ll combine reading tool documentation with pair programming to share knowledge and ensure consistent testing practices.

  1. While working on the V&V plan, the process of writing the non-functional requirements section went particularly well. My familiarity with usability, performance, and accessibility testing facilitated a smooth drafting process. Additionally, collaborative efforts from Aaron, Waleed, and Lucas enriched the document by incorporating their insights and ensuring that our approach was comprehensive and aligned with project goals.

  2. A significant challenge during the development of the V&V plan was aligning the diverse verification and validation approaches across different sections, which occasionally led to overlapping content or inconsistencies in test case formats. Managing these coordination issues was essential to maintain a coherent document. To resolve this, we held structured review sessions and used version control practices to ensure consistency and uniformity across sections. Regular team meetings and clearly documented responsibilities facilitated alignment, enabling the team to stay on the same page and maintain a clear direction throughout the process.

  3. To successfully complete the verification and validation of our project, the team will need to acquire a variety of knowledge and skills. For Aaron, expertise in dynamic testing knowledge is essential to ensure that the interfaces and design components meet quality standards through thorough functional and performance testing. Waleed will need static testing knowledge, focusing on tools for code analysis to detect inefficiencies or vulnerabilities in the ML inference server, ensuring robust backend support. Lucas requires an understanding of specialized tool usage such as for model validation and performance monitoring to effectively contribute to SAE interventions. I would need to acquire front-end testing and accessibility knowledge to build and validate the web-based editor’s functionality, ensuring it is intuitive and inclusive for users.

  4. To develop these essential knowledge areas, we have outlined multiple approaches for each skill and explained why each member has chosen their approach. For Aaron’s dynamic testing knowledge, he could either enroll in comprehensive courses focused on functional and performance testing or practice by applying dynamic tests to specific interface components. Aaron prefers the second approach as it provides immediate, practical application to his leadership in design work, helping him refine testing skills relevant to our project’s needs.


    Waleed’s static testing knowledge can be strengthened through tutorials on static code analysis tools or by conducting hands-on analysis of past codebases and open-source projects. Waleed has chosen the tutorial approach for a structured learning path that will directly aid in securing the ML inference server’s integrity, as he believes it will enable him to build foundational expertise efficiently.


    Lucas can gain proficiency in tool usage through workshops focusing on model validation and performance monitoring or by experimenting with these tools on controlled, smaller-scale projects. Lucas plans to attend workshops as they provide an opportunity to learn from experts and peers, making the process more collaborative and efficient. This is crucial for his role in SAE interventions.


    For my part, to develop front-end testing and accessibility knowledge, I could either enroll in courses that cover web accessibility best practices or work on self-directed projects where I conduct accessibility audits. I will pursue the course option because it offers a systematic approach to learning, ensuring I understand the principles thoroughly and can apply them effectively when developing and validating the web-based editor.


  1. Writing this deliverable gave me a better understanding of the project’s core features and functionality. It was particularly helpful in clarifying how specific requirements, like model steering and version control, fit within our overall vision for morph. Focusing on functional requirements also allowed me to refine my grasp of the system’s design goals and limitations, helping ensure that our planned features align with user needs and our technical scope. I also appreciated that the process provided room for collaboration with the team and supervisor, leading to more well-defined requirements.

  2. One of the main pain points I encountered was defining the functional scope accurately. There were initial uncertainties about how deeply certain features would be integrated, particularly technical aspects like the extent of model customization and maintaining performance in offline mode. It was challenging to balance the technical ambitions with practical constraints for the project’s first release. To overcome this, I initiated discussions with the team and supervisor to align our understanding and verify our priorities. These conversations clarified the project goals and identified where we could use existing frameworks versus areas that needed custom development.

  3. Our team will need a variety of skills to carry out verification and validation:

  • Dynamic Testing Knowledge (All Team Members): Understanding how to perform functional and performance testing, especially on features like model responsiveness (TTFT) and interface elements like the LTR panel.
  • Static Testing Knowledge (Aaron and Waleed): Knowledge of code analysis tools to detect vulnerabilities or inefficiencies, which will be critical for ensuring robust and secure code.
  • Front-End Testing and Accessibility (Nebras): Skills in front-end testing, including accessibility evaluations, to validate that the interface is intuitive and accessible for diverse users.
  4. Dynamic Testing Knowledge:
  • Approach 1: Enroll in online courses focused on software testing principles and tools.
  • Approach 2: Run dynamic tests on smaller sub-projects or components of morph to gain hands-on experience. All team members will apply Approach 2 by testing specific features in isolation as we build them.

Static Testing Knowledge:

  • Approach 1: Take specialized tutorials on static code analysis tools.
  • Approach 2: Practice static testing techniques on open-source projects or our previous codebases to identify common issues. Aaron and Waleed will adopt Approach 1, focusing on tutorials for code analysis. This approach will help build foundational knowledge that can later be applied during morph’s development for continuous quality assurance.

Front-End Testing and Accessibility:

  • Approach 1: Complete courses or certifications focused on web accessibility and front-end testing.
  • Approach 2: Participate in accessibility hackathons to practice identifying and solving accessibility issues in real-world projects. Nebras will pursue Approach 1 by taking a structured course in front-end accessibility. Given the importance of accessibility in our project, formal training will help Nebras systematically learn best practices and testing tools.

1. From the past examples and rubric, the expectations were easy to understand.
2. Some of my parts work as a summary of other partners' content, so I need to rearrange the relevant information after the other parts are completed.
3. The coding languages require us to unit test components with matching testing tools, such as Jest for JavaScript and pytest for Python. This builds on the testing knowledge I learned before.
4. Reading the websites that introduce the relevant testing frameworks, and walking through coding examples of the tests, are the two recommended approaches to acquiring this knowledge. The second approach is pursued by group members since the examples can be used as templates to construct our tests and provide detailed information about how to construct a unit test.

6.6 Revision

Date | Version | Notes
Nov. 4, 2024 | 0.0 | Initial VnV
Feb. 3, 2025 | 0.1 | Rev0
March 31, 2025 | 0.2 | Rename to morph for consistency
Apr. 3, 2025 | 0.4 | Full document revision and restructuring