Breaking the Syntax Barrier: Building Sakshyar, Nepal's First Nepali Programming Language
The journey of creating Sakshyar, Nepal's first programming language in Nepali, from tokenization to code generation. Because programming shouldn't be limited by language barriers.
During my third year of college, I found myself preoccupied with a question that felt both radical and necessary: why should the mastery of logic be gated by the mastery of English?
This project didn’t start in a vacuum. The initial spark came from a conversation with my good friend Nirav. He had the brilliant vision of localized syntax, and his excitement was infectious. It made me realize that while programming is universal, its entry point isn’t. This led to the creation of Sakshyar: a college project experiment, not a production-level compiler, but a proof-of-concept to explore whether Nepali syntax could lower the barrier to entry for programming.
The Philosophy of Sakshyar
I couldn’t help but wonder: when I watched juniors struggle, was it really about them lacking “computer brains,” or was something else at play? Maybe they weren’t struggling with logic itself, but with a double translation—first translating a real-world problem into logic, then that logic into English keywords like while, function, and return. I wasn’t certain, but the hypothesis felt worth testing.
The experiment was straightforward: build a minimal compiler that transpiled Nepali keywords to C, and see if it made a difference. In truth, we never got to properly verify whether localized syntax actually helped students learn faster or think more clearly. The project scope was limited, and we lacked the resources for a proper study. But what we did learn was invaluable—not about whether the hypothesis was right or wrong, but about how compilers work, how languages are structured, and how much complexity hides behind the syntax we take for granted.
The Grammar of the Soil
In this version of Sakshyar, we moved closer to the natural cadence of the Nepali language. We introduced strict typing and specific control structures that feel intuitive.
Core Keywords & Data Types
| Concept | Sakshyar Keyword | Usage |
|---|---|---|
| Numbers/Decimals | अंक | अंक संख्या = ५ |
| Strings | अक्षर | अक्षर नाम = "Sammelan" |
| Function | परिभाषा | Defining a block of logic |
| Return | फिर्ता | Exiting a function with a value |
| Output | लेख | Predefined function for standard output |
| Input | ल्याउ | Predefined function for standard input |
Logic & Control Flow
Our syntax uses the गर (Do) keyword to close blocks, making it read like a set of instructions rather than abstract symbols.
Conditionals
यदि k == 5 भए
// logic here
नत्र k == 6 भए
// logic here
नत्र
// fallback logic
गर
Loops
Sakshyar supports both “Until” and “While” logic, matching how we speak:
- Until: K == 5 नभए सम्म (Until K is 5)
- While: K < 5 भए सम्म (While K is less than 5)
- Break: निस्क (Exit)
- Continue: जाउ (Go)
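To make the mapping concrete, here is a minimal Python sketch of how these loop headers could be lowered to C. The marker strings come from the list above; the function name `lower_loop_header`, the `SIMPLE_STATEMENTS` table, and the string-based approach are assumptions for illustration, not the actual Sakshyar implementation.

```python
# Hypothetical sketch: lower a Sakshyar loop header to a C loop header.
WHILE_MARKER = "भए सम्म"    # "while"
UNTIL_MARKER = "नभए सम्म"   # "until"

# Assumed one-to-one mapping for the two jump statements.
SIMPLE_STATEMENTS = {"निस्क": "break;", "जाउ": "continue;"}


def lower_loop_header(line):
    """Return the opening line of the equivalent C while loop."""
    text = line.strip()
    # Test the "until" marker first, because it ends with the "while" marker.
    if text.endswith(UNTIL_MARKER):
        condition = text[: -len(UNTIL_MARKER)].strip()
        # "Until X" means "while not X", so the condition is negated.
        return f"while (!({condition})) {{"
    if text.endswith(WHILE_MARKER):
        condition = text[: -len(WHILE_MARKER)].strip()
        return f"while ({condition}) {{"
    raise ValueError(f"not a loop header: {line!r}")


print(lower_loop_header("K < 5 भए सम्म"))
print(lower_loop_header("K == 5 नभए सम्म"))
```

Wrapping the negated condition in its own parentheses keeps the inversion correct even when the condition contains operators that bind more loosely than `!`.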
Compiler Architecture: Under the Hood
Building a compiler from scratch is a humbling exercise in detail. You aren’t just writing code; you are building the machine that understands code.
- Lexical Analysis (The Tokenizer) The first challenge was Unicode. Nepali text is complex: a single visible character can be stored as more than one Unicode code point. For example, “फ़” may be the single code point U+095E or the pair U+092B U+093C. Our lexer had to normalize the input before splitting it into a stream of tokens, so that the compiler could even begin to “read.”
- Syntax Analysis (Parsing) Once we had a stream of tokens, the next step was to understand their structure. We implemented a Recursive Descent Parser that reads tokens and builds a parse tree according to our language’s grammar. To formalize this grammar, we defined Sakshyar in BNF (Backus-Naur Form), which served as our blueprint, ensuring that every यदि eventually found its matching गर, and that expressions were properly nested and evaluated.
Here’s the complete grammar specification:
| Rule Name | Production Rule |
|---|---|
| Program | Function* EOF |
| Function | "परिभाषा" "(" Parameters? ")" IDENTIFIER Block "गर" |
| Parameters | Type IDENTIFIER ("," Type IDENTIFIER)* |
| Type | "अंक" \| "अक्षर" |
| Statement | IfStmt \| WhileStmt \| UntilStmt \| ReturnStmt \| PrintStmt \| InputStmt \| CallStmt \| BreakStmt \| ContinueStmt |
| Block | Statement* |
| IfStmt | "यदि" Expression "भए" Block ("नत्र" Expression "भए" Block)* ("नत्र" Block)? "गर" |
| WhileStmt | Expression "भए सम्म" Block "गर" |
| UntilStmt | Expression "नभए सम्म" Block "गर" |
| ReturnStmt | Expression "फिर्ता" |
| PrintStmt | (Expression ("," Expression)*)? IDENTIFIER? "लेख" |
| InputStmt | IDENTIFIER "ल्याउ" |
| Break/Cont | "निस्क" \| "जाउ" |
| Expression | Equality |
| Equality | Comparison (("==" \| "!=") Comparison)* |
| Comparison | Term ((">" \| ">=" \| "<" \| "<=") Term)* |
| Term | Factor (("+" \| "-") Factor)* |
| Factor | Unary (("*" \| "/") Unary)* |
| Unary | ("!" \| "-") Unary \| Primary |
| Primary | NUMBER \| STRING \| IDENTIFIER \| "(" Expression ")" \| CallStmt |
| CallStmt | (Expression ("," Expression)*)? IDENTIFIER |
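The expression rules in this table translate almost mechanically into recursive descent code, with one method per rule. The Python sketch below covers only Equality down to Primary, and it leaves out STRING and CallStmt. The `Parser` class, the tuple-based tree, and the whitespace-separated input are assumptions for illustration, not the real Sakshyar parser.

```python
# Hypothetical recursive descent parser for the expression rules only.
class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def binary(self, operand, operators):
        # Shared loop for rules shaped like: operand (op operand)*
        node = operand()
        while self.peek() in operators:
            op = self.advance()
            node = (op, node, operand())
        return node

    def equality(self):
        return self.binary(self.comparison, ("==", "!="))

    def comparison(self):
        return self.binary(self.term, (">", ">=", "<", "<="))

    def term(self):
        return self.binary(self.factor, ("+", "-"))

    def factor(self):
        return self.binary(self.unary, ("*", "/"))

    def unary(self):
        if self.peek() in ("!", "-"):
            op = self.advance()
            return (op, self.unary())
        return self.primary()

    def primary(self):
        token = self.advance()
        if token == "(":
            node = self.equality()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is None or not (token[0].isalnum() or token[0] == "_"):
            raise SyntaxError(f"unexpected token {token!r}")
        return token  # NUMBER or IDENTIFIER leaf


def parse_expression(source):
    """Parse a whitespace-separated expression into nested tuples."""
    parser = Parser(source.split())
    tree = parser.equality()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


print(parse_expression("k + 2 * 3 == 11"))
```

Each rule calls the rule below it for its operands, so operator precedence comes from the call order rather than from a separate precedence table.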
- Code Generation (The Bridge to C) To ensure Sakshyar was fast and portable, I chose C as the target language. The Sakshyar compiler reads Nepali, builds an Abstract Syntax Tree (AST), and then “transpiles” it into optimized C code.
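As a rough illustration of the AST-to-C step, the Python sketch below walks a tiny tree and emits C text. It is not the Sakshyar code generator: the node classes, the `emit_function` helper, the ASCII function name `jod`, and the mapping of अंक to `double` are all assumptions, since the post does not say which C types or names the real compiler produces.

```python
from dataclasses import dataclass

# Assumed type mapping; the real compiler may use different C types.
C_TYPES = {"अंक": "double", "अक्षर": "const char*"}


@dataclass
class VarDecl:
    type_name: str
    name: str
    value: str  # expression already rendered as C text


@dataclass
class Return:
    value: str


@dataclass
class Function:
    name: str
    params: list  # list of (type_name, name) pairs
    body: list
    return_type: str = "अंक"  # assumed default return type


def emit_statement(stmt):
    """Render one statement node as a line of C."""
    if isinstance(stmt, VarDecl):
        return f"{C_TYPES[stmt.type_name]} {stmt.name} = {stmt.value};"
    if isinstance(stmt, Return):
        return f"return {stmt.value};"
    raise TypeError(f"unsupported node: {stmt!r}")


def emit_function(fn):
    """Render a function node as a C function definition."""
    params = ", ".join(f"{C_TYPES[t]} {n}" for t, n in fn.params) or "void"
    lines = [f"{C_TYPES[fn.return_type]} {fn.name}({params}) {{"]
    lines += ["    " + emit_statement(stmt) for stmt in fn.body]
    lines.append("}")
    return "\n".join(lines)


# Hand-built tree for an addition function with two numeric parameters.
add_fn = Function(
    name="jod",
    params=[("अंक", "ka"), ("अंक", "kha")],
    body=[VarDecl("अंक", "ga", "ka + kha"), Return("ga")],
)
print(emit_function(add_fn))
```

Keeping one small emit function per node type means each new statement kind only needs one new branch.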
Implementation Example: The Addition Function
Here is how a standard function definition and call look in Sakshyar:
Function Definition
// definition of addition
परिभाषा (अंक ka, अंक kha) जोड
अंक ga = ka + kha
ga फिर्ता
गर
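To see the first compiler stage at work on the definition above, here is a minimal tokenizer sketch in Python. It applies NFC normalization before scanning, so different encodings of the same visible character compare equal. The token kinds, the keyword set, and the `tokenize` function are assumptions for illustration. Comments and error recovery are left out.

```python
import unicodedata


def nfc(text):
    return unicodedata.normalize("NFC", text)


# Assumed keyword set, normalized the same way as the input.
KEYWORDS = {nfc(w) for w in (
    "परिभाषा", "फिर्ता", "लेख", "ल्याउ", "यदि", "भए", "नत्र", "गर",
    "अंक", "अक्षर", "निस्क", "जाउ",
)}
TWO_CHAR_OPS = ("==", "!=", ">=", "<=")


def is_word_char(ch):
    # Letters plus combining marks (vowel signs, virama) and underscore.
    return ch == "_" or unicodedata.category(ch)[0] in ("L", "M")


def tokenize(source):
    """Split source text into (kind, value) pairs."""
    text = nfc(source)
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch.isdecimal():
            start = i
            while i < len(text) and text[i].isdecimal():
                i += 1
            # int() accepts Devanagari digits such as ५.
            tokens.append(("NUMBER", int(text[start:i])))
        elif ch == '"':
            end = text.find('"', i + 1)
            if end == -1:
                raise SyntaxError("unterminated string")
            tokens.append(("STRING", text[i + 1:end]))
            i = end + 1
        elif text[i:i + 2] in TWO_CHAR_OPS:
            tokens.append(("SYMBOL", text[i:i + 2]))
            i += 2
        elif is_word_char(ch):
            start = i
            while i < len(text) and (is_word_char(text[i]) or text[i].isdecimal()):
                i += 1
            word = text[start:i]
            kind = "KEYWORD" if word in KEYWORDS else "IDENTIFIER"
            tokens.append((kind, word))
        else:
            tokens.append(("SYMBOL", ch))
            i += 1
    return tokens


for token in tokenize("परिभाषा (अंक ka, अंक kha) जोड"):
    print(token)
```

Word characters are classified by Unicode category because Devanagari vowel signs and the virama are combining marks, not letters, and a letters-only test would cut words apart.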
Function Call & Output
// Call with parameters
५, ६ जोड
// Standard output
"नमस्ते!" लेख
Challenges and Solutions
Identifier Translation: C compilers handle non-ASCII identifiers inconsistently, so I implemented a transliteration system that converts Nepali variable names into valid ASCII equivalents behind the scenes.
Semantic Logic: Implementing नभए सम्म (Until) required a logical inversion in the compiler logic to map correctly to C’s while(!condition) structure.
Standard I/O: Making लेख (write) and ल्याउ (read) feel like keywords while they were actually wrappers around printf and scanf required careful handling of variadic arguments.
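The identifier problem can be illustrated with a much simpler stand-in technique than real transliteration. The Python sketch below does not romanize Nepali names the way Sakshyar does. Instead it hex-escapes every non-ASCII code point, which is enough to produce a valid C identifier. The `to_c_identifier` function and the `sk_` prefix are assumptions for illustration.

```python
# Hypothetical stand-in for transliteration: hex-escape non-ASCII code points.
PREFIX = "sk_"  # assumed prefix; also guarantees the name never starts with a digit


def to_c_identifier(name):
    """Map any identifier to an ASCII-only name that is valid in C."""
    parts = []
    for ch in name:
        if ch.isascii() and (ch.isalnum() or ch == "_"):
            parts.append(ch)
        else:
            # For example, U+091C becomes "_u091c".
            parts.append(f"_u{ord(ch):04x}")
    return PREFIX + "".join(parts)


print(to_c_identifier("जोड"))
print(to_c_identifier("ka"))
```

This scheme is not guaranteed to be collision-free: an ASCII name that already looks like an escape sequence, such as `_u091c`, maps to the same output as the character it imitates. A real implementation would need to rule that case out.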
Impact & Reflection
Sakshyar was showcased at mnsa.cc and received positive feedback from the Nepali developer community, though it remained a proof-of-concept rather than a tool anyone would use in production.
Building this taught me that compiler design is 70% planning and 30% implementation. It demystified the “magic” of programming languages. Now, when I use a high-level language, I don’t just see syntax; I see the lexer, the parser, and the memory management happening underneath. The hypothesis about localized syntax helping students might remain unproven, but the journey of building Sakshyar was its own reward—a deep dive into the machinery that makes programming languages possible.
I am deeply grateful to Nirav for the initial spark. This project, while experimental, made me question assumptions about who programming languages are for and how they should be designed.
References
- Crafting Interpreters by Bob Nystrom - Excellent compiler design book
- LLVM Tutorial - Modern compiler backend
- Sakshyar Source Code - Full implementation
- Project Website - Demo and documentation