Show HN: Codegen – OSS Python Library for Advanced Code Manipulation (codegen.com)
15 points by _jayhack_ on Jan 29, 2025 | 3 comments
Hey HN! We've just open-sourced Codegen (https://github.com/codegen-sh/codegen-sdk), a Python library for manipulating Python + JS/React codebases.

Codegen was engineered backwards from real-world, large-scale codebase analysis + refactors we performed on multi-million-line enterprise codebases. It provides a scriptable interface to a powerful, multi-lingual language server built on Tree-sitter.

We realized that many code transformation tasks that impact large teams - refactors, enforcing patterns, analyzing control flow - are fundamentally programmatic operations. Yet existing tools like LibCST and jscodeshift often require you to think in terms of ASTs and parser internals rather than the high-level changes you want to make.

Therefore, we built Codegen to match how developers actually think about code changes:

  # Move a symbol to a new file
  # Handles imports, references, dependencies
  function.move_to_file("new_file.py") 

  # Rename across the codebase
  class_def.rename("NewName")  # Updates all usages, preserves formatting

  # Analyze call patterns
  for usage in function.usages:
      print(f"Used in {usage.file.name}")

Codegen handles the edge cases automatically - updating imports, preserving dependencies, maintaining references, and resolving naming conflicts. You focus on intent, we handle the details.

Under the hood, Codegen performs static analysis to build a rich graph representation of your code. This enables:

- Versatile and comprehensive operations

- Built-in visualization capabilities

- Blazing fast execution of large-scale refactors
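
To make the "graph representation" idea concrete, here is a minimal sketch using only Python's standard `ast` module. It is not Codegen's implementation (Codegen is built on Tree-sitter and tracks far more than calls); the `SOURCE` snippet and the `build_call_graph` helper are hypothetical names made up for this illustration. It only shows how static analysis can turn source text into edges you can query, such as "who uses this function?".

```python
import ast
from collections import defaultdict

# Hypothetical source to analyze; a real tool would read files from disk.
SOURCE = '''
def helper():
    return 1

def compute():
    return helper() + helper()

def main():
    print(compute())
'''


def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each top-level function name to the set of functions that call it."""
    tree = ast.parse(source)
    # Only track functions defined at module level in this snippet.
    defined = {n.name for n in tree.body if isinstance(n, ast.FunctionDef)}
    callers = defaultdict(set)
    for func in tree.body:
        if not isinstance(func, ast.FunctionDef):
            continue
        # Every plain-name call inside the body is an edge: callee <- caller.
        for node in ast.walk(func):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                if node.func.id in defined:
                    callers[node.func.id].add(func.name)
    return dict(callers)


graph = build_call_graph(SOURCE)
for name, used_by in sorted(graph.items()):
    print(f"{name} is used in: {sorted(used_by)}")
```

This toy version resolves calls by bare name within a single file; handling imports, methods, and cross-file references is the hard part that a dedicated library takes on.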

We've seen a wide variety of advanced code manipulation programs emerge, including:

- Mining codebases for LLM pre-training data

- Analyzing security vulnerabilities

- Large-scale API migrations

- Enforcing code patterns

We're excited to share this with the community and look forward to your feedback. Give it a spin and let us know what you think!

  uv tool install codegen
  codegen notebook --demo

Docs: https://docs.codegen.com
GitHub: https://github.com/codegen-sh/codegen-sdk
Community: https://community.codegen.com

Let us know if you have any questions or interesting use cases you'd like to explore.



Man, at first glance the documentation looks so good.

I’ve been meaning to build a PoC for directly manipulating symbols instead of text, with the idea of eventually eliminating the possibility of syntax errors.

The task always looked one step too big to be worth it: the foundation for programmatically manipulating code seemed to be missing. Maybe Roslyn fits the bill, but C# isn’t interesting to me ecosystem-wise.

It seems like this may be what I was waiting for - pretty cool!


How does this compare to Codemodder? Can it be used for transpilation?


Codemodder has extensive Java support, which Codegen does not offer at the moment. Otherwise, my understanding is that Codemodder focuses on AST-level syntactic modifications. Codegen computes a richer graph data structure, which can be used for sophisticated modifications that depend on inheritance hierarchies, function usages, cross-file references, and more.

Codemodder is written in Java, whereas you can write Codegen programs in a Jupyter notebook or anywhere you can run Python.
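
As a rough illustration of a query that depends on inheritance hierarchies and cross-file references, here is a small sketch using only the standard `ast` module. It does not use the Codegen API; the in-memory `FILES` codebase and the `subclasses_of` helper are hypothetical, and class names are matched by bare name only, which a real graph would resolve through imports.

```python
import ast

# Hypothetical two-file codebase, held in memory for the example.
FILES = {
    "base.py": "class Shape:\n    pass\n",
    "shapes.py": (
        "from base import Shape\n"
        "class Circle(Shape):\n    pass\n"
        "class Unit(Circle):\n    pass\n"
    ),
}


def subclasses_of(files: dict[str, str], root: str) -> list[tuple[str, str]]:
    """Return (class name, file name) for every direct or indirect subclass of root."""
    # First pass: record parent -> children edges across all files.
    children: dict[str, list[tuple[str, str]]] = {}
    for filename, source in files.items():
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.ClassDef):
                for base in node.bases:
                    if isinstance(base, ast.Name):
                        children.setdefault(base.id, []).append((node.name, filename))
    # Second pass: follow the edges transitively from the root class.
    found, stack = [], [root]
    while stack:
        parent = stack.pop()
        for name, filename in children.get(parent, []):
            found.append((name, filename))
            stack.append(name)
    return found


result = subclasses_of(FILES, "Shape")
for name, filename in result:
    print(f"{name} (defined in {filename}) inherits from Shape")
```

A purely per-file AST transform would see `Unit(Circle)` in isolation and miss that it ultimately derives from `Shape`; the cross-file edge map is what makes the question answerable.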



