I spent a few hours trying to get this to work, and I couldn’t get it to produce usable results on anything except the training data, even with very simple drawings.
I noticed in the GitHub that they mention it is only around 60% reliable even on their own training data, but the image shown on the front page feels pretty misleading. I made 10 images that were very similar in complexity to the examples shown, and even after running it around 50 times on each image, not a single one worked correctly. In the rare cases where it produced something, the output was completely wrong.
This seems pretty misleading in its current state and definitely needs more work.
I’m also confused by their examples. All of them seem to be perfect renders/exports from 3D models, which is not the use case where I see it being most useful. Making a parametrized CAD model out of a hand-drawn sketch: yes, please.
Neat, but I don't really see the utility. The time-consuming part of CAD drawing comes from figuring out the correct dimensions of each feature, spacing, sizing, tolerances, etc., and constraining the drawing so that it's easy to tweak later on, which this doesn't do at all. Maybe you could draw a 2D sketch of what you want and then generate it, but you'd still have to do the hard part.
I think this is true under the assumption that you know the CAD tool well. From my recent experience (I have a 3D printer), I regularly find myself in a situation where I know what I want to make, I can take measurements, and I can make a sketch on paper. Yet turning it into a proper 3D model in something like FreeCAD is super tedious. I know OpenSCAD relatively well, but when it comes to something more complex I struggle a lot. A recent example: I was making a water tap for a Lego Duplo kitchen sink for my little one :)
So I would really appreciate a good AI/LLM tool that I can feed my sketch and parameters to, and that can save me hours of searching the web and watching tutorials on how to extrude a circle over a curve.
BTW, existing AI tools work really well with OpenSCAD, so if you want a parametrized model that can be made out of simple shapes, I highly recommend this flow.
> So I would really appreciate a good AI/LLM tool that I can feed my sketch and parameters to, and that can save me hours of searching the web and watching tutorials on how to extrude a circle over a curve.
I think this is possible, but the ‘trick’ would be translating your instructions in English into some kind of language that the CAD software understands.
I’m on a bunch of 3D printing forums, and everyone tries to describe what the finished product would LOOK like. They end up making PICTURES when what they really want is an STL file.
Two dimensions are easier to visualize than three, so let’s put it this way:
If you wanted to turn “English” into “a 2D image that’s dimensionally accurate”, you’d want to translate from “English” to “SVG.”
SVG is dimensionally accurate. JPG isn’t: the JPG format itself has no concept of “dimensions,” only “pixels.”
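To make the SVG point concrete, here is a minimal sketch of what "dimensionally accurate" could look like in practice. It assumes a made-up part (a plate with a centered hole) and a hypothetical helper name, `plate_with_hole_svg`; the only real mechanism it relies on is that SVG accepts physical units such as `mm` on `width`/`height`, and that a matching `viewBox` makes one user unit equal one millimetre.

```python
def plate_with_hole_svg(width_mm: float, height_mm: float, hole_d_mm: float) -> str:
    """Return SVG markup for a plate outline with a centered hole, in millimetres.

    The part and this helper are illustrative assumptions, not from any tool
    discussed in the thread.
    """
    cx, cy, r = width_mm / 2, height_mm / 2, hole_d_mm / 2
    return (
        # Physical size in mm, and a viewBox of the same numbers, so that
        # every coordinate below is a real-world millimetre.
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width_mm}mm" height="{height_mm}mm" '
        f'viewBox="0 0 {width_mm} {height_mm}">\n'
        f'  <rect x="0" y="0" width="{width_mm}" height="{height_mm}" '
        f'fill="none" stroke="black" stroke-width="0.2"/>\n'
        f'  <circle cx="{cx}" cy="{cy}" r="{r}" '
        f'fill="none" stroke="black" stroke-width="0.2"/>\n'
        f'</svg>\n'
    )


svg = plate_with_hole_svg(40.0, 20.0, 5.0)
```

An LLM asked for "a 40 by 20 mm plate with a 5 mm hole in the middle" only has to emit those few numbers as text, which is a much easier target than pixels.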
I very much have this problem. I am uninterested in the art of sketching and constraining. I have spent many hours attempting it and am content with the knowledge that I don't have the touch, but I still sometimes have problems I want to solve with 3D prints. LLMs could offer a solution, but they are bullshit machines and by their nature overpromise. Drafting is not a trivial skill that we can just wave a wand at and magically automate. We'll get there eventually, but we're not there.
So it was quite logical that $work-1 spent so much time on this. When you have point clouds generated from crappy head-mounted cameras, you get models that are very complex.
So an active area of research is point cloud to "CAD" model (i.e. simplified, where a LACK table would be ~40 triangles rather than 400k).
One way to do it is to say "oh, this point cloud looks like a table, let's generate a bunch of hypothesis tables and see if they fit." One way to do that is to have a model that understands parametric CAD and can create a number of tables with parameters that can be adjusted until one fits.
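The hypothesis-and-fit idea can be sketched in a few lines. This is only a toy under heavy assumptions: the "table" is reduced to a flat rectangular top described by (width, depth, height), the scan is synthetic, the candidate sizes come from a coarse hand-picked grid, and the fit score is a symmetric mean nearest-neighbour (Chamfer-style) distance. All function names here are made up for illustration.

```python
import itertools
import math
import random


def tabletop_points(width, depth, height, n=10):
    """n x n grid of points on a width x depth rectangle at the given height."""
    return [((i + 0.5) * width / n, (j + 0.5) * depth / n, height)
            for i in range(n) for j in range(n)]


def mean_nearest(a, b):
    """Mean distance from each point in a to its nearest point in b."""
    return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)


def chamfer(a, b):
    """Two-sided score, so hypotheses that are too big are penalised as well."""
    return mean_nearest(a, b) + mean_nearest(b, a)


def fit_table(cloud, widths, depths, heights):
    """Brute-force search: return the (width, depth, height) that fits best."""
    candidates = itertools.product(widths, depths, heights)
    return min(candidates, key=lambda p: chamfer(cloud, tabletop_points(*p)))


# Synthetic "scan": a noisy 0.55 x 0.55 m top at 0.45 m (assumed numbers).
rng = random.Random(0)
cloud = [(rng.uniform(0, 0.55) + rng.gauss(0, 0.005),
          rng.uniform(0, 0.55) + rng.gauss(0, 0.005),
          0.45 + rng.gauss(0, 0.005)) for _ in range(200)]

sizes = [0.30, 0.55, 0.80]
best = fit_table(cloud, sizes, sizes, [0.30, 0.45, 0.60])
```

The payoff is the one described above: the result is three numbers driving a parametric model instead of hundreds of thousands of triangles. A real system would presumably use a proper optimiser and a full table model with legs rather than a grid search over a rectangle.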
A perhaps easier way is to take a point cloud, get an image model trained on CAD models to draw the model as 2D imagery, then use something like this to get an actual model out.
It's not efficient, but it might work.
There are also lots of other cases, like automatic plagiarism, which are less good.
If you wanted to brute force it, it might be possible to have it generate a hundred outputs and then include a second pass to automatically select the generated model that most accurately resembles the expected output.
Basically leverage the randomness to create many variations, then select the most accurate variation automatically.
Terribly wasteful of time and processing power, but so is using GPU time to make pretty pictures randomly.
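As a rough sketch of that generate-many-then-select loop: the generator below is a random stand-in for one stochastic model run, and the score is a plain squared error on two dimensions. Both are assumptions for illustration; in a real pipeline the score would more likely compare a render of each candidate against the input image.

```python
import random


def generate_candidate(rng):
    """Stand-in for one stochastic model run: returns (width, height) of a part."""
    return (rng.uniform(5, 15), rng.uniform(5, 15))


def score(candidate, target):
    """Lower is better: squared error against the expected dimensions."""
    return sum((c - t) ** 2 for c, t in zip(candidate, target))


def best_of_n(n, target, seed=0):
    """Generate n candidates and keep the one that scores best."""
    rng = random.Random(seed)
    candidates = [generate_candidate(rng) for _ in range(n)]
    return min(candidates, key=lambda c: score(c, target))


target = (10.0, 8.0)
single = best_of_n(1, target)
best = best_of_n(100, target)
```

With the same seed, the first candidate is shared between both calls, so the best of 100 can never score worse than the single run. The cost is simply 100 times the compute, which is the waste mentioned above.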
If you can just take a pencil and draw a piece of furniture, press a button and get a semi-decent CAD drawing to tweak, that'd be a huge tool for carpenters and such.
Not splats as such, but text-to-polygon-model and image-to-polygon-model tools exist, and for figurines it's fine to convert their output to formats for 3D printers.
I don't know that app specifically, but from all the videos of different lidar and other 3D scanning tools I have seen, the results are pretty bad and require a lot of sculpting after the scan. The whole point was that with a few images the ML model could construct the actual 3D model for you.
Doesn't really help if I can't find them, and I guess if I could find them so could GW, and they would be taken down. Having an application you can host at home that could do the job from pictures would be awesome.
This has been easy with OpenSCAD for a long time. I have made lots of cool, complex models this way. I built a repo of the prompts I use to show the LLM how to do this, and it includes many of the models I've created this way...
This isn't B-rep based modelling. Far from it. It builds up the feature tree and then uses a geometry kernel to generate the B-rep based representation. The final generator output script could probably be adjusted to generate OpenSCAD. It only supports 2D drawings containing lines, arcs and circles, plus the extrude operation for making them 3D. Operations like fillets and chamfers, which depend on intermediate B-rep model state, are not supported.
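For readers who haven't seen a feature tree, here is a minimal sketch of the kind of structure described: 2D profiles plus extrude operations, with no B-rep anywhere. The class names are invented for illustration and are not the project's actual format. Arcs are left out for brevity, and the volume function assumes every cut lies fully inside existing material.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Circle:
    cx: float
    cy: float
    r: float

    def area(self) -> float:
        return math.pi * self.r ** 2


@dataclass
class Polyline:
    """Closed loop of straight lines through the given (x, y) vertices."""
    points: List[Tuple[float, float]]

    def area(self) -> float:
        # Shoelace formula over consecutive vertex pairs (wrapping around).
        pts = self.points
        pairs = zip(pts, pts[1:] + pts[:1])
        return abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs)) / 2


@dataclass
class Extrude:
    profile: Union[Circle, Polyline]
    height: float
    cut: bool = False  # True removes material instead of adding it


def volume(features: List[Extrude]) -> float:
    """Naive volume: assumes every cut lies fully inside existing material."""
    return sum((-1 if f.cut else 1) * f.profile.area() * f.height for f in features)


# Feature tree for a 40 x 20 x 5 plate with a 5 mm through-hole.
plate = [
    Extrude(Polyline([(0, 0), (40, 0), (40, 20), (0, 20)]), height=5),
    Extrude(Circle(cx=20, cy=10, r=2.5), height=5, cut=True),
]
```

Everything here can be evaluated without a geometry kernel. A fillet on the hole's edge could not, because that edge only exists once the cut has been computed, which matches the limitation described above.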
I would even argue that for basic modelling, the majority of tools/features in CAD operate at an abstraction closer to CSG for describing the "what", and B-rep is only treated as the "how". Likewise, a good chunk of code-based CAD uses a combination of CSG for the "what" and a triangle-mesh-based geometry engine for the "how". That's assuming you consider standard 2D->3D operations (extrude, revolve, sweep along an arbitrary 2D profile) as valid primitives for CSG.
The user comes into direct contact with B-rep in very specific situations: (1) doing operations like fillets/chamfers/draft/thickness based on intermediate geometry, (2) attaching sketches or other features to generated geometry, or using generated edges (instead of new sketches) to guide operations like complex sweeps, (3) surface-based modelling workflows where you build up the solid from individual faces, typically including complex curved surfaces.
In cases 1 and 2 the dependency on a B-rep-based representation is only marginal. In theory you could select edges in a triangle-mesh-based underlying representation, but the quality of the result wouldn't be as nice, and TNP (topological naming problem) issues for parametric model editing would likely be even bigger than they are for existing CAD. That's not really CSG territory anymore, but it isn't exclusive to B-rep either, and it involves a bunch of work that's outside the scope of B-rep. In non-parametric mesh modellers with more destructive editing workflows, like Blender, chamfers and fillets work fine. If anything, for reliable parametric models you often want to limit dependencies on intermediate geometry, as that depends on the CAD keeping track of where each edge/face originated from outside the B-rep, and it increases the chance of TNP issues.
Case 3 is critical for industrial design containing a large number of complex curved surfaces, like cars and other consumer products, but there are also many more technical parts where it can be completely ignored. CAD tutorials for beginners almost completely ignore this category of CAD modelling. The point about not being exclusive to B-rep also applies to surface modelling.
If you want something based on B-rep, look at projects that use OpenCASCADE under the hood, as that is one of the only B-rep CAD kernels available that is free and open source. Some examples would be CADQuery, CascadeStudio, or RepliCAD.
You need to use https://github.com/smurfix/buildscad which can convert OpenSCAD to Build123d, then you can export a STEP file. It has a few limitations, but for simple OpenSCAD files it can generate an equivalent representation in OpenCASCADE.
I just ask it for what I’m looking for (very simple “spare part” level 3D printing at home, nothing fancy or elaborate) and it gives me a starting point. Then sometimes I just edit the scad code by hand, sometimes I ask the AI to revise, and sometimes it’s a mix (many iterations).
For very simple geometries it works great, but it very quickly becomes apparent that there’s a bit of a disconnect between “LLM views image” and “LLM emits scad that looks like that image” when it comes to anything non-trivial.
Still gives me a starting point I can mess with, which is great since I have zero CAD training or experience.
Tbh that sounds harder than just learning CAD, which is really not that difficult if you use a proper parametric editor - I would recommend SOLIDWORKS first. It's got the easiest UX so is ideal for learning. They actually have a vaguely reasonably priced subscription now, but IMO it's still way too much for occasional hobby use so I'd recommend just pirating it (which is easy).
Once you have learnt a bit then the only FOSS options that are worth a damn are a) SolveSpace which is quite good and light, has a slightly quirky UI (but not in a bad way) but unfortunately has some critical missing features at the moment - notably bevels/chamfers. Although I did see someone made a sloppy PR to add them so we'll see where that goes.
Or b) FreeCAD which is actually good now and fairly close to SOLIDWORKS (at least for the basic stuff you're likely to use) and has a reasonably good UX. Some rough edges still but overall it's very usable. Good enough that I reach for it instead of pirating SOLIDWORKS these days.
The basic workflow is pretty simple:
1. Make some planes, referenced from existing geometry.
2. Make sketches on the planes.
3. Extrude/revolve them (either adding or subtracting from the existing geometry).
4. Repeat until you have the right shape.
5. Add a load of chamfers to make it pretty.
From my experience people who heavily rely on LLMs are allergic to learning anything new (with the exception of learning new and improved ways to generate slop). They just 'want to get stuff done', even if it means staying in a local maximum forever.
Not OP, but I just ask Claude Code to make me an OpenSCAD file. If I need changes I ask for them in plain English. If you are specific, it's not the quickest loop, but it works. I usually ask it to parameterize the model enough that I can quickly print small prototypes on my 3D printer. Once I am happy with the mini version I print the full-size model.
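To illustrate the "parameterize it so I can print a mini version first" idea, here is a small sketch that writes out an OpenSCAD file with named parameters and a global scale factor. The part (a plate with a through-hole), the parameter names, and the helper `bracket_scad` are all assumptions for illustration, not anything the commenter posted.

```python
def bracket_scad(width=60.0, depth=40.0, thickness=4.0, hole_d=5.0, scale=1.0):
    """Return OpenSCAD source for a plate with a centered through-hole.

    All names and default values are illustrative assumptions.
    """
    return f"""// Generated file: tweak the parameters, not the geometry.
width = {width};
depth = {depth};
thickness = {thickness};
hole_d = {hole_d};
s = {scale};  // set below 1 for a quick small test print

scale([s, s, s])
difference() {{
    cube([width, depth, thickness]);
    // Overshoot the hole by 1 mm on each side so the cut is clean.
    translate([width / 2, depth / 2, -1])
        cylinder(h = thickness + 2, d = hole_d, $fn = 64);
}}
"""


mini = bracket_scad(scale=0.25)   # quarter-size prototype
full = bracket_scad()             # full-size part
```

One caveat with uniform scaling: holes and clearances shrink too, so a mini print checks the overall shape but not how it fits real hardware.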
I'm sorry, but which ones of these are complex? I'm looking through some [1] [2] [3] of the output PNGs and they look trivial. Like, my first 3d model in Blender trivial.
In comparison, here's one of my recent designs: what I would still call a very simple case [4]. And it's not like I'm a trained mechanical engineer working commercially, this is stuff I design in my spare time as a programmer.
I wanted to see how well it performed on real pictures of parts or hand-drawn drawings, but when I tried setting up the Docker image, I immediately ran into all kinds of dependencies not being installed. The examples make me suspect it doesn't work well beyond images that were generated from CAD in the first place.
A Docker image is a reinvention of a program, and a Docker container is a reinvention of a process. At first they were self-contained - but so were the first programs and processes.
To the author, if they happen to see this: please kill the auto-playing video. If someone is listening to something else on their phone, this always takes over and interrupts it.
I've seen this and other attempts like this[0] while exploring improvements for my CAD AI[1]. I think these are potentially powerful solutions, but none of the current projects/weights have had enough training (data or time) to work on arbitrary models. MeshCoder pretty much only works on models based on their training data. I haven't tried GenCAD, but other commenters have confirmed my suspicion.
Seeing as I'm attempting to build my own CAD program in Rust I checked out the hosted website. I'm not really sure what is supposed to be working and what isn't.
I can't help but be skeptical of one person writing ~115k LOC in 4 months, and that is just the Rust crates, never mind the frontend (which is another 100k LOC!!!).
I'm curious why you decided to go with "eager" tessellation. Creating a circle immediately results in a bunch of lines which resemble a circle but would fail under tangency constraints quickly. Is this a current limitation or part of the strategy for the kernel?
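The tangency concern is easy to quantify. The sketch below assumes the circle is tessellated as a regular polygon with its vertices on the true circle, and that the constraint solver checks tangency by comparing a distance against the radius within some tolerance. The segment count and tolerance are made-up values, not taken from the project in question.

```python
import math


def polygon_inradius(radius: float, segments: int) -> float:
    """Distance from the center to an edge midpoint of a regular polygon
    whose vertices lie on a circle of the given radius."""
    return radius * math.cos(math.pi / segments)


def is_tangent(center_to_line: float, curve_radius: float, tol: float = 1e-6) -> bool:
    """A line is tangent when its distance from the center equals the radius."""
    return abs(center_to_line - curve_radius) <= tol


r, n = 10.0, 32
line_distance = r  # a line placed exactly tangent to the true circle

exact_ok = is_tangent(line_distance, r)
# Against the tessellated circle, the nearest edge sits closer to the center,
# so the same line misses it by r * (1 - cos(pi / n)).
tessellated_ok = is_tangent(line_distance, polygon_inradius(r, n))
gap = r - polygon_inradius(r, n)
```

That gap is the worst case, at the middle of an edge; a line that happens to meet the polygon at a vertex would still touch it. Raising the segment count shrinks the gap but never closes it, which is why kernels usually keep the analytic circle and tessellate only for display.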
As someone also working on CAD in their spare time, I also tried the hosted app. I get the feeling that this was made by someone who has never opened a (professional) CAD app at all. It feels like a mix between a tool like Blender and CSG-style CAD, but even that doesn't seem to work. Maybe this is proof that vibe coding does not work, at least not for this kind of application.
In my own project I use LLMs very sparingly and hesitantly, and I have observed that they are not very useful on the hard parts of CAD. I expect this is because of a lack of training material. Most professional CAD applications are proprietary, and books on the topic are usually sparse on implementation details. The non-B-rep CAD applications such as OpenSCAD and family are probably overrepresented in the training data.
This might also explain why people's experiences with LLMs vary so much. If you stay on the happy path of CRUD web development and the like, all is well, but the further you veer off this path, the more you get challenged.
(forgot to mention, it's wired up to Claude so you can vibe CAD, like OP but with a few more steps - I'd like to train a similar model soon! I also wrote about my first stab at this https://campedersen.com/cad0)
I'm stumped by things like this. The drawing & modelling are not the difficult bit - the CAM programming is.
That seems difficult enough that I have not found an open source program to load a 3D model and allow me to set the toolpaths in a UI, never mind have an LLM generate them from the model.
Maybe I missed something, but if you have the image rendering in the first place, you (likely) already have the CAD. It is a nice demo, but what is the utility?
Ideally it would tie in with an LLM, no? Like, you would want to be able to say something like "create a design of a car suspension subject to x, y, z constraints".
So, at this point, it seems like this will work with all CAD programs, since they have yet to encounter any systems that they can't work with. More seriously, my guess would be whatever one is available for free in their lab. Kind of standard operating procedure for academic projects -- do a proof of concept, make a video that avoids known bugs, get a grade, push source to git, graduate. Good ideas come out of that... production code... eh... maybe.
More likely someone ends up in the situation that my kid did: the previous graduate student's git repo is stale by 2 versions of C++ and 4 versions of ROS, and neither of the two unit tests still works after porting.
Doesn't matter. CAD models/objects are represented by a sequence of operations on a primitive or sketch, unlike meshes, which describe the resulting shape of objects in 3D programs like Blender.
So the point is that their model outputs that hierarchy of operations: the history of development, not just the result.
How does it not matter? CAD programs are not all going to have exactly the same interface and commands. I doubt, for example, that this will generate an OpenSCAD text file.
Code to compute fillets and blends gets incredibly complex when multiple surfaces are involved. And when surfaces are barely intersecting, or almost coincident, all bets are off what the command will do - very much depends on the geometry kernel and the tolerances it uses whether it decides the surfaces even intersect. And if it decides they don't intersect, all downstream commands will fail. Handling tolerances is one of the hardest aspects of CAD. (It's no coincidence that most open source CAD applications always demo with the same relatively basic types of models - they just can't do truly complex CAD.)
So a simple set of operations - cube, sphere, intersect - sure that will work anywhere and will be portable across applications and makes a nice simple demo. But once you start doing any serious CAD modeling the result is kernel dependent. That's why portable CAD formats like STEP do not preserve the commands used to generate the results. And why native CAD application formats do preserve the command history but are not portable across applications.
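A tiny illustration of the tolerance point, using two spheres that miss each other by a hair. The tolerance values are arbitrary picks standing in for "kernel A" and "kernel B"; real kernels have their own defaults and far more involved intersection code.

```python
import math


def spheres_intersect(c1, r1, c2, r2, tol):
    """Treat two spheres as touching if their surfaces come within tol."""
    return math.dist(c1, c2) <= r1 + r2 + tol


a = ((0.0, 0.0, 0.0), 1.0)
b = ((2.0 + 1e-7, 0.0, 0.0), 1.0)  # almost, but not quite, touching

loose = spheres_intersect(*a, *b, tol=1e-6)   # a kernel with a loose tolerance
strict = spheres_intersect(*a, *b, tol=1e-9)  # a kernel with a strict tolerance
```

Same geometry, same command, opposite answers. Every downstream operation (a boolean union, a fillet across the seam) inherits that decision, which is one reason a replayed command history is not portable between kernels.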
It could be anything, which is why the question of what it actually outputs was asked. I had a skim through the page and code but couldn't see what the output was.
Is this Google-affiliated? The heading font is Product/Google Sans, which IIRC only Alphabet is allowed to use, and the entire webpage seems to be Google-style, but neither of the two named researchers seems to be employed by Google?
So it was quite logical that $work-1 spent so much time on this. When you have point clouds generated from crappy head-mounted cameras, you get models that are very complex.
For example, if you look at a point cloud of an Ikea LACK (https://www.ikea.com/gb/en/p/lack-nest-of-tables-set-of-2-wh...), it will be massively complex. This means that when you want to perform any kind of interaction with it, it's computationally difficult (https://www.researchgate.net/publication/221064696/figure/fi...).
https://github.com/cjtrowbridge/vibe-modeling
It's analogous to "all squares are rectangles, but not all rectangles are squares" (squares=CSG, rectangles=BREP)
CSG by itself isn't suitable for most CAD use-cases.
What is your workflow for LLM integration with OpenSCAD?
I've one-shotted a lightsaber hilt with threaded parts and it worked flawlessly.
(I’m not the commenter you replied to)
[1] - https://github.com/cjtrowbridge/vibe-modeling/blob/main/outp...
[2] - https://github.com/cjtrowbridge/vibe-modeling/blob/main/outp...
[3] - https://github.com/cjtrowbridge/vibe-modeling/blob/main/outp...
[4] - https://object.ceph-eu.hswaw.net/q3k-personal/fe3e54e6df604a...
Ironically the former is engineered to avoid the latter.
Dockerfiles alone are usually full of apt-get and curl | sh calls, which might be why the Docker image won't build.
[0]: https://daibingquan.github.io/MeshCoder/
[1]: https://grandpacad.com
I also wrote a bit about what goes into CAD apps! https://campedersen.com/tessellation
Which CAD program? I'm confused
Am I reading this right?
>Most importantly, GenCAD does not merely generate a 3D solid but also the entire CAD program.
Clue here: > Our proposed GenCAD architecture...
Looks like you can go JSON -> STEP files, but not really in such a way that you can modify any of the operations.
* https://github.com/mightyhorst/DeepCAD
https://arxiv.org/abs/2603.04337 https://arxiv.org/abs/2603.05607 https://arxiv.org/abs/2605.01171
For a more detailed review: https://github.com/lichengzhanguom/LLMs-CAD-Survey-Taxonomy
"These fonts are licensed under the Open Font License. You can use them in your products & projects – print or digital, commercial or otherwise."
Then they have a trained LLM that can generate KCL to either create new parts or act as an LLM assistant for changes to existing parts.
It’s neat that LLMs can do 3D, but I wonder how much of the problem is integration.