Architecture Overview
Understanding the internal architecture of SpiceCode provides valuable context for both users seeking to grasp its capabilities and contributors aiming to extend or modify the tool. SpiceCode is designed with a modular architecture, emphasizing separation of concerns and leveraging native components for core language processing tasks. This approach ensures self-sufficiency and allows for consistent analysis across different programming languages, much like the well-defined ecological and social structures that allow life to thrive on Arrakis.
Core Components and Workflow
The analysis process in SpiceCode follows a well-defined pipeline, involving several key modules working in concert:
- Command Line Interface (cli module): This is the primary user entry point. It parses user commands (e.g., spice analyze, spice export), arguments (like file paths, --all, --format), and options. It orchestrates the overall workflow based on the user's request and handles user interaction, including displaying results and managing language settings (using utils.get_translation).
- Language Detection (utils.get_lang): When an analysis or export command is issued for a specific file, the CLI first utilizes this utility to determine the programming language based on the file's extension (e.g., .py -> Python).
- Lexer Selection (utils.get_lexer): Based on the detected language, this utility dynamically loads and instantiates the appropriate language-specific lexer from the lexers module (e.g., PythonLexer for Python).
- Lexical Analysis (lexers module): The selected native lexer reads the source code file character by character and transforms it into a linear sequence of tokens (e.g., keywords, identifiers, operators, literals), as defined in lexers.token. This is the first stage of understanding the code's basic elements.
- Syntactic Analysis (parser module): The stream of tokens generated by the lexer is fed into the native parser (parser.parser). The parser checks if the token sequence conforms to the grammatical rules of the specific language. If the syntax is valid, the parser constructs an Abstract Syntax Tree (AST), a hierarchical representation of the code's structure, using node definitions from parser.ast.
- Analysis Orchestration (spice.analyze - inferred): A central component, likely within the spice package (e.g., spice.analyze.py), takes the generated AST. This orchestrator determines which specific analyzers from the spice/analyzers directory are applicable and required (based on language and user flags like --all or interactive selection).
- Metric Calculation (spice/analyzers module): The orchestrator invokes the selected analyzers, passing the AST to them. Each analyzer (e.g., count_lines.py, count_functions.py) traverses the AST, examining specific nodes and structures to calculate its designated metric(s).
- Result Aggregation: The analysis orchestrator collects the results (metrics) returned by each individual analyzer.
- Output/Export (cli module): The aggregated results are passed back to the CLI module. The CLI then formats these results for display in the terminal (potentially using translations from utils.get_translation) or uses specific export logic (likely within cli.commands.export) to generate output files in the requested format (JSON, CSV, Markdown, HTML).
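The pipeline above can be sketched end to end in a few dozen lines. This is a toy illustration, not SpiceCode's code: ToyLexer, Node, parse, ANALYZERS, and analyze_source are hypothetical names, the extension table is an assumption, and the "parser" only recognizes function definitions. It is meant only to show how detection, lexing, parsing, analysis, and aggregation hand data to one another.

```python
import os
import re
from dataclasses import dataclass, field

# Hypothetical extension table; the real utils.get_lang may work differently.
EXTENSION_TO_LANGUAGE = {".py": "python"}


def get_lang(path):
    """Language detection: map a file extension to a language name."""
    ext = os.path.splitext(path)[1]
    if ext not in EXTENSION_TO_LANGUAGE:
        raise ValueError(f"unsupported file extension: {ext!r}")
    return EXTENSION_TO_LANGUAGE[ext]


@dataclass
class Token:
    kind: str   # "KEYWORD", "IDENT", or "OTHER"
    value: str


class ToyLexer:
    """Lexical analysis: turn source text into a flat list of tokens."""

    KEYWORDS = {"def", "return"}
    # A word, or any single non-whitespace character.
    PATTERN = re.compile(r"(?P<WORD>[A-Za-z_]\w*)|(?P<OTHER>\S)")

    def tokenize(self, source):
        tokens = []
        for match in self.PATTERN.finditer(source):
            text = match.group()
            if match.lastgroup == "WORD":
                kind = "KEYWORD" if text in self.KEYWORDS else "IDENT"
            else:
                kind = "OTHER"
            tokens.append(Token(kind, text))
        return tokens


@dataclass
class Node:
    kind: str
    name: str = ""
    children: list = field(default_factory=list)


def parse(tokens):
    """Syntactic analysis: build a tiny AST that records only function definitions."""
    module = Node("Module")
    for index, token in enumerate(tokens):
        if token.kind == "KEYWORD" and token.value == "def":
            following = tokens[index + 1] if index + 1 < len(tokens) else None
            if following is None or following.kind != "IDENT":
                raise SyntaxError("expected a function name after 'def'")
            module.children.append(Node("FunctionDef", following.value))
    return module


def count_functions(ast):
    """Analyzer: number of function definitions in the AST."""
    return sum(1 for node in ast.children if node.kind == "FunctionDef")


def list_function_names(ast):
    """Analyzer: names of the function definitions, in source order."""
    return [node.name for node in ast.children if node.kind == "FunctionDef"]


# Hypothetical analyzer table keyed by metric name.
ANALYZERS = {
    "function_count": count_functions,
    "function_names": list_function_names,
}


def analyze_source(path, source, selected=None):
    """Orchestration: detect, lex, parse, run the chosen analyzers, aggregate."""
    get_lang(path)  # raises for unsupported files; a real pipeline would pick a lexer from the result
    tokens = ToyLexer().tokenize(source)
    ast = parse(tokens)
    names = list(ANALYZERS) if selected is None else selected
    return {name: ANALYZERS[name](ast) for name in names}
```

Passing selected=["function_count"] plays the role of choosing specific analyzers, while leaving it as None corresponds loosely to the --all flag. Each analyzer receives only the AST, so adding a metric in this sketch means adding one function and one table entry.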
Key Design Principles
- Modularity: Each phase of the process (lexing, parsing, analyzing, CLI interaction) is handled by distinct modules, promoting separation of concerns.
- Native Implementation: Core language processing (lexing and parsing) is handled by custom-built components within SpiceCode, eliminating external dependencies for these critical tasks.
- Extensibility: The structure, particularly within lexers and spice/analyzers, is designed to facilitate the addition of support for new languages or new analysis metrics by adding new modules following the established patterns.
- Dynamic Loading: Utilities like get_lexer allow the system to adapt dynamically to different languages without requiring hardcoded conditional logic for every supported language in the main pipeline.
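One common way to avoid per-language conditionals is a registry that lexer classes add themselves to. The sketch below uses that technique as an illustration; how SpiceCode's get_lexer actually locates and loads lexers is not described here, and register_lexer, LEXER_REGISTRY, JavaScriptLexer, and the line_comment attribute are hypothetical names.

```python
# Hypothetical registry: language name -> lexer class.
LEXER_REGISTRY = {}


def register_lexer(language):
    """Class decorator that records a lexer class under a language name."""
    def decorator(lexer_cls):
        LEXER_REGISTRY[language] = lexer_cls
        return lexer_cls
    return decorator


@register_lexer("python")
class PythonLexer:
    line_comment = "#"


@register_lexer("javascript")
class JavaScriptLexer:
    line_comment = "//"


def get_lexer(language):
    """Look up and instantiate the lexer for a language, with no if/elif chain."""
    try:
        lexer_cls = LEXER_REGISTRY[language]
    except KeyError:
        raise ValueError(f"no lexer registered for {language!r}") from None
    return lexer_cls()
```

With this arrangement, supporting another language means adding one decorated class; get_lexer and the code that calls it stay unchanged.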
This architecture provides a robust and maintainable foundation for SpiceCode, enabling it to deliver consistent and insightful code analysis across a growing range of programming languages.