Skills Officially Comes to Codex

(developers.openai.com)

74 points | by rochansinha 5 hours ago

13 comments

andybak 47 minutes ago
Skills, plugins, apps, connectors, MCPs, agents - anyone else getting a bit lost?
- ksdnjweusdnkl21 1 minute ago
  It's like JS frameworks. Just wait until a React emerges and get up to speed with that later.
- Frost1x 36 minutes ago
  In my opinion it’s to some degree an artifact of immature and/or rapidly changing technology. Basically not many know what the best approach is, all the use cases aren’t well understood, and things are changing so rapidly they’re basically just creating interfaces around everything so you can change flow in and out of LLMs any way you may desire.
  Some paths are emerging as popular, but in a lot of cases we’re still not sure even these are the long-term paths that will remain. It doesn’t help that there’s not a good taxonomy (that I’m aware of) to define and organize the different approaches out there. “Agent” for example is a highly overloaded term that means a lot of things and even in this space, agents mean different things to different groups.
- maddmann 20 minutes ago
  It reminds me of LLM output at scale. LLMs tend to produce a lot of similar but slightly different ideas in a codebase when not properly guided.
- not_a_toaster 36 minutes ago
  They’re all bandaids
- iLoveOncall 18 minutes ago
  All marketing names for APIs and prompts. IMO you don't need to even try to follow, because there's nothing inherently new or innovative about any of this.
freakynit 1 hour ago
I already was doing something similar on a regular basis.
I have many "folders"... each with a README.md, a scripts folder, and an optional GUIDE.md.
Whenever I arrive at some code that I know can be reused easily (for example: a clerk.dev integration that spans both frontend and backend), I create a "folder" for it.
When needed, I used to just copy-paste all the folder content using my https://www.npmjs.com/package/merge-to-md package.
This has worked flawlessly for me until now.
Glad we are bringing such capability natively into these coding agents.
cube2222 1 hour ago
It's so nice that skills are becoming a standard, they are imo a much bigger deal long-term than e.g. MCP.
Easy to author (at its most basic, just a markdown file), context efficient by default (only preloads yaml front-matter, can lazy load more markdown files as needed), can piggyback on top of existing tooling (for instance, instead of the GitHub MCP, you just make a skill describing how to use the `gh` cli).
Compared to purpose-tuned system prompts they don't require a purpose-specific agent, and they also compose (the agent can load multiple skills that make sense for a given task).
Part of the effectiveness of this is that AI models are heavy enough that running a sandbox VM for them on the side is likely irrelevant cost-wise, so now the major chat UI providers all give the model such a sandboxed environment. That means skills can also contain Python and/or JS scripts, which is again much simpler, more straightforward, and more flexible than e.g. requiring the target to expose remote MCPs.
Finally, you can use a skill to tell your model how to properly approach using your MCP server - which previously often required either long prompting, or a purpose-specific system prompt, with the cons I've already described.
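To make the "only preloads yaml front-matter, can lazy load more" point concrete, here is a minimal sketch. It assumes a skill is a Markdown file whose front-matter sits between two `---` lines and holds simple `key: value` pairs. The function names and the sample skill are hypothetical, and the hand-rolled parsing is a stand-in for a real YAML library; it is not how Codex or any particular agent implements loading.

```python
from pathlib import Path
import tempfile


def read_front_matter(path: Path) -> dict[str, str]:
    """Read only the leading front-matter block, stopping before the body."""
    meta: dict[str, str] = {}
    with path.open(encoding="utf-8") as fh:
        if fh.readline().strip() != "---":
            return meta  # no front-matter present
        for line in fh:
            if line.strip() == "---":
                break  # end of front-matter; the body is not read here
            key, sep, value = line.partition(":")
            if sep:
                meta[key.strip()] = value.strip()
    return meta


def load_body(path: Path) -> str:
    """Load the full instructions, called only once the skill is needed."""
    lines = path.read_text(encoding="utf-8").splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return "".join(lines)
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return "".join(lines[i + 1:])
    return ""  # front-matter was never closed


# Hypothetical skill describing how to use the gh CLI.
SAMPLE_SKILL = """---
name: gh-cli
description: Use the gh CLI for GitHub issues and pull requests
---
Run `gh pr list` to see open pull requests.
"""

with tempfile.TemporaryDirectory() as tmp:
    skill_file = Path(tmp) / "SKILL.md"
    skill_file.write_text(SAMPLE_SKILL, encoding="utf-8")
    meta = read_front_matter(skill_file)  # cheap: goes into every context
    body = load_body(skill_file)          # expensive: loaded on demand

print(meta["name"], "-", meta["description"])
```

The split into two functions is the whole point: the short description is always visible to the model, while the body only costs tokens when a task actually matches.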
- hu3 1 hour ago
  Perhaps you could help me.
  I'm having a hard time figuring out how I could leverage skills in a medium-sized web application project.
  It's python, PostgreSQL, Django.
  Thanks in advance.
  I wonder if skills are more useful for non crud-like projects. Maybe data science and DevOps.
  - JamesSwift 0 minutes ago
    Skills are the Matrix scene where Neo learns kung fu. Imagine they are a database of specialized knowledge that an agent can instantly tap into _on demand_.
    The key here is “on demand”. Not every agent or conversation needs to know kung fu. But when they do, a skill is waiting to be consumed. This basic idea is “progressive disclosure” and it composes nicely to keep context windows focused. E.g. I have a Metabase skill to query analytics. Within that I conditionally refer to how to generate authentication if they aren’t authenticated. If they are authenticated, that information need not be consumed.
    Some practical “skills”: writing tests, fetching Sentry info, using Playwright (a lot of local MCPs are just flat out replaced by skills), submitting a PR according to team conventions (e.g. run lint, review code for X, title matches format, etc.)
  - freakynit 1 hour ago
    Skills are not useful for single-shot cases. They are for: cross-team standardization (for LLM generated code), and reliable reusability of existing code/learnings.
  - jonrosner 1 hour ago
    You could, for example, create a skill to access your database for testing purposes and pass in your table specifications so that the agent can easily retrieve data for you on the fly.
    - derrida 12 minutes ago
      Oooooo, woah, I didn't really "get it" thanks for spelling it out a bit, just thought of some crazy cool experiments I can run if that is true.
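A rough sketch of the database skill idea from this subthread: a helper script a skill could bundle so the agent can look up table specifications and fetch test data. It uses `sqlite3` from the standard library as a stand-in for PostgreSQL so that it runs anywhere; the function names and the `orders` table are hypothetical.

```python
import sqlite3


def describe_tables(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Return {table: ["column TYPE", ...]} for every user-defined table."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        if not row[0].startswith("sqlite_")  # skip SQLite's internal tables
    ]
    spec = {}
    for name in names:
        # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{name}")').fetchall()
        spec[name] = [f"{col[1]} {col[2]}" for col in columns]
    return spec


def run_read_only(conn: sqlite3.Connection, sql: str, params=()):
    """Run a single SELECT and refuse any other statement."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    return conn.execute(sql, params).fetchall()


# Throwaway in-memory database standing in for a test database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.executemany("INSERT INTO orders (total) VALUES (?)", [(9.5,), (20.0,)])

spec = describe_tables(conn)
rows = run_read_only(conn, "SELECT COUNT(*) FROM orders")
print(spec, rows)
```

The prefix check is only a guard rail against accidents. Against a real database, a role with read-only permissions is the safer boundary.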
rdli 1 hour ago
This is great. At my startup, we have a mix of Codex/CC users so having a common set of skills we can all use for building is exciting.
It’s also interesting to see how instead of a plan mode like CC, Codex is implementing planning as a skill.
- greymalik 41 minutes ago
  I’m probably missing it, but I don’t see how you can share skills across agents, other than maybe symlinking .claude/skills and .codex/skills to the same place?
  - rdli 24 minutes ago
    Nothing super-fancy. We have a common GitHub repo in our org for skills, and everyone checks out the repo into their preferred setup locally.
    (To clarify, I meant that some engineers mostly use CC while others mostly use Codex, as opposed to engineers using both at the same time.)
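The symlink approach mentioned in this subthread can be sketched as follows, assuming one shared checkout of the team's skills repo and the `.claude/skills` and `.codex/skills` locations named above (taken from the comment, not verified against either tool's documentation). The function name and the sample skill are hypothetical.

```python
from pathlib import Path
import tempfile


def link_skills(shared: Path, project: Path, agent_dirs=(".claude", ".codex")) -> list[Path]:
    """Point each agent's skills directory at the same shared checkout."""
    links = []
    for agent_dir in agent_dirs:
        link = project / agent_dir / "skills"
        link.parent.mkdir(parents=True, exist_ok=True)
        if link.is_symlink():
            link.unlink()  # replace a stale link
        elif link.exists():
            raise FileExistsError(f"{link} exists and is not a symlink")
        link.symlink_to(shared.resolve(), target_is_directory=True)
        links.append(link)
    return links


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    shared = root / "team-skills"  # stands in for the checked-out skills repo
    (shared / "pr-review").mkdir(parents=True)
    (shared / "pr-review" / "SKILL.md").write_text(
        "Run lint before opening a PR.\n", encoding="utf-8"
    )
    project = root / "app"
    links = link_skills(shared, project)
    targets_match = all(link.resolve() == shared.resolve() for link in links)
    seen = (project / ".codex" / "skills" / "pr-review" / "SKILL.md").read_text(
        encoding="utf-8"
    )

print(targets_match, seen.strip())
```

Creating symlinks on Windows may require elevated privileges or developer mode, so a plain checkout per tool can be the simpler option there.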
stared 1 hour ago
Yes! I was raving about Claude Skills a few days ago (vide https://quesma.com/blog/claude-skills-not-antigravity/), and I'm excited they are coming to Codex as well!
- derrida 4 minutes ago
  Thanks for that! You mentioned Antigravity seemed slow. I just started playing with it too (I haven't given it a good enough go yet to really evaluate it), but I had the model set to Gemini Flash; maybe you get a speed-up if you do that?
jonrosner 1 hour ago
one thing that I am missing from the specification is a way to inject specific variables into the skills. If I create let's say a postgres-skill, then I can either (1) provide the password on every skill execution or (2) hardcode the password into my script. To make this really useful there needs to be some kind of secret storage that the agent can read/write. This would also allow me as a programmer to sell the skills that I create more easily to customers.
- j_bum 9 minutes ago
  I have no clue how you’re running your agents or what you’re building, but giving the raw password string to the model seems dubious?
  Otherwise, why not just keep the password in an .env file, and state “grab the password from the .env file” in your Postgres skill?
- bavell 8 minutes ago
  > there needs to be some kind of secret storage that the agent can read/write
  Why not the filesystem?
  I would create a local file (e.g. .env) in each project using postgres, then in my postgres skill, tell the agent to check that file for credentials.
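A minimal sketch of the `.env` suggestion above, assuming the skill ships a script that reads credentials itself and hands them to a child process through the environment, so the password never has to appear in the prompt. `PGUSER` and `PGPASSWORD` are environment variables that libpq clients such as `psql` read; the function names, the file contents, and the choice to forward only `PG*` keys are assumptions made for illustration.

```python
from pathlib import Path
import tempfile


def load_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values: dict[str, str] = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def child_env(env_file: Path, base: dict[str, str]) -> dict[str, str]:
    """Build the environment for a child process, adding only PG* entries."""
    env = dict(base)
    env.update(
        {k: v for k, v in load_env_file(env_file).items() if k.startswith("PG")}
    )
    return env


with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / ".env"
    # Placeholder credentials for the demo only.
    env_file.write_text(
        '# local only\nPGUSER=app\nPGPASSWORD="s3cret"\nOTHER=1\n', encoding="utf-8"
    )
    env = child_env(env_file, {"PATH": "/usr/bin"})

# Print the variable names only, never the values.
print(sorted(env))
```

The resulting dictionary could be passed as the `env` argument of `subprocess.run`. The agent can still read the file if it has filesystem access, so this keeps the secret out of the skill text rather than out of the agent's reach.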
not_a_toaster 37 minutes ago
We’ve made a zero shot decision tree
mikaelaast 1 hour ago
Are we sure that unrestricted free-form Markdown content is the best configuration format for this kind of thing? I know there is a YAML frontmatter component to this, but doesn't the free-form nature of the "body" part of these configuration files lead to an inevitably unverifiable process? I would like my agents to be inherently evaluable, and free-text instructions do not lend themselves easily to systematic evaluation.
- coldtea 43 minutes ago
  >doesn't the free-form nature of the "body" part of these configuration files lead to an inevitably unverifiable process?
  The non-deterministic statistical nature of LLMs means it's inherently an "inevitably unverifiable process" to begin with, even if you pass it some type-checked, linted skills file or prompt format.
  Besides, YAML or JSON or XML or free-form text, for the LLM it's just tokens.
  At best you could parse the more structured docs with external tools more easily, but that's about it, not much difference when it comes to their LLM consumption.
- Etheryte 1 hour ago
  The modern state of the art is inherently not verifiable. Which way you give it input is really secondary to that fact. When you don't see weights or know anything else about the system, any idea of verifiability is an illusion.
  - mikaelaast 1 hour ago
    Sure. Verifiability is far-fetched. But say I want to produce a statistically significant evaluation result from this – essentially testing a piece of prose. How do I go about this, short of relying on a vague LLM-as-a-judge metric? What are the parameters?
    - coldtea 40 minutes ago
      Would a structured skills file format help you evaluate the results more?
  - hu3 1 hour ago
    At least MCPs can be unit tested.
    With Skills however, you just selectively append more text to prompt and pray.
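One possible way to approach the "statistically significant evaluation" question in this subthread, sketched under assumptions: run the same task many times with and without the skill, score each run with a deterministic check (for example, does the test suite pass), and compare the two pass rates with a two-proportion z-test. The function name is hypothetical, and the counts below are made-up placeholders, not measurements.

```python
import math


def two_proportion_z(pass_a: int, n_a: int, pass_b: int, n_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference between two pass rates."""
    p_a, p_b = pass_a / n_a, pass_b / n_b
    pooled = (pass_a + pass_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # every run passed or every run failed
    z = (p_a - p_b) / se
    # Two-sided p-value under the normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Placeholder counts: 42 of 50 runs pass with the skill, 30 of 50 without.
z, p_value = two_proportion_z(42, 50, 30, 50)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

The parameters are then the task set, the number of runs per arm, and the pass/fail check. The normal approximation gets shaky with few runs or pass rates near 0 or 1, where an exact test is the safer choice.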
alexgotoi 32 minutes ago
At any HR conference you go to, there are two overused words: AI and Skills.
As of this week, this also applies to Hacker News.
summarity 2 hours ago
See also:
Anthropic: https://www.anthropic.com/engineering/equipping-agents-for-t...
Copilot: https://github.blog/changelog/2025-12-18-github-copilot-now-...
karolcodes 2 hours ago
Anyone using this in an agentic workflow already? How is it?
rochansinha 5 hours ago
Agent Skills let you extend Codex with task-specific capabilities. A skill packages instructions, resources, and optional scripts so Codex can perform a specific workflow reliably. You can share skills across teams or the community, and they build on the open Agent Skills standard.
Skills are available in both the Codex CLI and IDE extensions.
- dan_wood 3 hours ago
  Thanks to Anthropic.
haffi112 2 hours ago
What are your favourite skills?
- dmd 15 minutes ago
  A very particular set of skills.
- pylotlight 1 hour ago
  nunchuck skills
  - not_a_toaster 35 minutes ago
    The only skill that matters