
Architecture Overview

Understanding the internal architecture of SpiceCode is useful both for users who want to grasp its capabilities and for contributors who want to extend or modify the tool. SpiceCode has a modular architecture that emphasizes separation of concerns and relies on native components for the core language processing tasks, lexing and parsing. This keeps the tool self-sufficient, with no external dependencies for those tasks, and allows for consistent analysis across different programming languages, much like the well-defined ecological and social structures that allow life to thrive on Arrakis.

Core Components and Workflow

The analysis process in SpiceCode follows a well-defined pipeline, involving several key modules working in concert:

  1. Command Line Interface (cli module): This is the primary user entry point. It parses user commands (e.g., spice analyze, spice export), arguments (like file paths, --all, --format), and options. It orchestrates the overall workflow based on the user's request and handles user interaction, including displaying results and managing language settings (using utils.get_translation).

  2. Language Detection (utils.get_lang): When an analysis or export command is issued for a specific file, the CLI first utilizes this utility to determine the programming language based on the file's extension (e.g., .py -> Python).

  3. Lexer Selection (utils.get_lexer): Based on the detected language, this utility dynamically loads and instantiates the appropriate language-specific lexer from the lexers module (e.g., PythonLexer for Python).

  4. Lexical Analysis (lexers module): The selected native lexer reads the source code file character by character and transforms it into a linear sequence of tokens (e.g., keywords, identifiers, operators, literals), as defined in lexers.token. This is the first stage of understanding the code's basic elements.

  5. Syntactic Analysis (parser module): The stream of tokens generated by the lexer is fed into the native parser (parser.parser). The parser checks if the token sequence conforms to the grammatical rules of the specific language. If the syntax is valid, the parser constructs an Abstract Syntax Tree (AST), a hierarchical representation of the code's structure, using node definitions from parser.ast.

  6. Analysis Orchestration (spice.analyze - inferred): A central component, likely within the spice package (e.g., a spice.analyze module), takes the generated AST. This orchestrator determines which analyzers from the spice/analyzers directory apply, based on the detected language and on user input such as the --all flag or an interactive selection.

  7. Metric Calculation (spice/analyzers module): The orchestrator invokes the selected analyzers, passing the AST to them. Each analyzer (e.g., count_lines.py, count_functions.py) traverses the AST, examining specific nodes and structures to calculate its designated metric(s).

  8. Result Aggregation: The analysis orchestrator collects the results (metrics) returned by each individual analyzer.

  9. Output/Export (cli module): The aggregated results are passed back to the CLI module. The CLI then formats these results for display in the terminal (potentially using translations from utils.get_translation) or uses specific export logic (likely within cli.commands.export) to generate output files in the requested format (JSON, CSV, Markdown, HTML).
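
The nine steps above can be condensed into a short, runnable sketch. It is an illustration of the pipeline's shape, not SpiceCode's actual code: get_lang and get_lexer are named after the utilities described above, but their bodies, the Token and Node classes, ToyPythonLexer, parse, the ANALYZERS table, and the analyze function are all hypothetical stand-ins. The toy lexer uses a regular expression rather than reading character by character, the toy parser only recognizes `def NAME` pairs and performs no grammar checking, and analyze takes the source text directly instead of reading the file.

```python
import json
import os
import re
from dataclasses import dataclass, field


@dataclass
class Token:  # hypothetical stand-in for the token type in lexers.token
    kind: str
    value: str


@dataclass
class Node:  # hypothetical stand-in for the node types in parser.ast
    kind: str
    name: str = ""
    children: list = field(default_factory=list)


EXTENSION_TO_LANG = {".py": "python"}  # assumed mapping


def get_lang(path):
    """Step 2: detect the language from the file extension."""
    return EXTENSION_TO_LANG[os.path.splitext(path)[1]]


class ToyPythonLexer:
    """Step 4: turn source text into a flat list of tokens."""

    PATTERN = re.compile(
        r"(?P<name>[A-Za-z_]\w*)|(?P<number>\d+)|(?P<op>[^\sA-Za-z_\d])"
    )
    KEYWORDS = {"def", "return"}

    def tokenize(self, source):
        tokens = []
        for match in self.PATTERN.finditer(source):
            kind, value = match.lastgroup, match.group()
            if kind == "name" and value in self.KEYWORDS:
                kind = "keyword"
            tokens.append(Token(kind, value))
        return tokens


LEXERS = {"python": ToyPythonLexer}


def get_lexer(lang):
    """Step 3: pick and instantiate the lexer for the detected language."""
    return LEXERS[lang]()


def parse(tokens):
    """Step 5, heavily simplified: one FunctionDef node per `def NAME` pair."""
    module = Node("Module")
    for current, following in zip(tokens, tokens[1:]):
        if current == Token("keyword", "def") and following.kind == "name":
            module.children.append(Node("FunctionDef", following.value))
    return module


def count_functions(tree):
    """Step 7: an analyzer walks the tree and returns one metric."""
    return sum(1 for child in tree.children if child.kind == "FunctionDef")


def function_names(tree):
    return [child.name for child in tree.children if child.kind == "FunctionDef"]


ANALYZERS = {"function_count": count_functions, "function_names": function_names}


def analyze(path, source, selected=None):
    """Steps 6 and 8: choose analyzers, run them, and aggregate the results."""
    tokens = get_lexer(get_lang(path)).tokenize(source)
    tree = parse(tokens)
    return {name: ANALYZERS[name](tree) for name in (selected or ANALYZERS)}


source = "def add(a, b):\n    return a + b\n\ndef main():\n    return add(1, 2)\n"
results = analyze("example.py", source)
print(json.dumps(results, indent=2))  # step 9: export as JSON
```

Each stage only sees the output of the previous one: analyzers receive a tree, never raw text or tokens. That is what lets a metric be written once and reused for any language whose lexer and parser produce the same node types.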

Key Design Principles

  • Modularity: Each phase of the process (lexing, parsing, analyzing, CLI interaction) is handled by distinct modules, promoting separation of concerns.
  • Native Implementation: Core language processing (lexing and parsing) is handled by custom-built components within SpiceCode, eliminating external dependencies for these critical tasks.
  • Extensibility: The structure, particularly within lexers and spice/analyzers, is designed to facilitate the addition of support for new languages or new analysis metrics by adding new modules following the established patterns.
  • Dynamic Loading: Utilities like get_lexer allow the system to adapt dynamically to different languages without requiring hardcoded conditional logic for every supported language in the main pipeline.
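
The dynamic loading and extensibility principles can be illustrated with a table-driven loader. This is a sketch of one common way to do it, assuming a lookup table plus importlib; it is not taken from SpiceCode's source, and the real get_lexer may work differently. The module name demo_lexers_python, the LEXER_TABLE mapping, and the no-argument constructor are hypothetical. The stand-in module is built in memory only so that the example runs on its own.

```python
import importlib
import sys
import types

# Build a stand-in lexer module in memory. In a real project this would be
# a file on disk inside the lexers package.
_demo_module = types.ModuleType("demo_lexers_python")


class PythonLexer:
    """Placeholder lexer class; the real one would tokenize source code."""


_demo_module.PythonLexer = PythonLexer
sys.modules["demo_lexers_python"] = _demo_module

# Hypothetical table: language name -> (module to import, class to load).
# Supporting a new language means adding one entry and one module.
LEXER_TABLE = {
    "python": ("demo_lexers_python", "PythonLexer"),
}


def get_lexer(lang):
    """Import the lexer module for `lang` on demand and return an instance."""
    try:
        module_name, class_name = LEXER_TABLE[lang]
    except KeyError:
        raise ValueError(f"unsupported language: {lang}") from None
    module = importlib.import_module(module_name)
    return getattr(module, class_name)()


lexer = get_lexer("python")
print(type(lexer).__name__)
```

With this shape, the main pipeline never names a specific language: it calls get_lexer with whatever get_lang returned, and an unsupported language fails in one place with a clear error.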

This architecture provides a robust and maintainable foundation for SpiceCode, enabling it to deliver consistent and insightful code analysis across a growing range of programming languages.