
GitHub's Double CLI Release: How Two AI Tools Are Reshaping Development Workflows

Pini Shvartsman
Architecting the future of software, cloud, and DevOps. I turn tech chaos into breakthrough innovation, leading teams to extraordinary results in our AI-powered world. Follow for game-changing insights on modern architecture and leadership.

This week, GitHub released not one but two different CLI tools for AI development. Most people are focusing on the individual features. I’m seeing something bigger: a significant step toward AI becoming development infrastructure rather than just an assistant.

Here’s what actually happened: GitHub released both an update to their regular CLI (version 2.80.0) and a completely separate standalone Copilot CLI tool. Together, they represent two different but complementary approaches to AI-powered development.

This represents a meaningful shift in how we can build and maintain software.

Two Different Tools, One Big Vision

Let me break down what GitHub actually released:

Tool 1: GitHub CLI 2.80.0 with Agent Tasks

This updates the regular gh CLI you already know with new agent-task commands:

# Start a coding agent task and track it
gh agent-task create "refactor the authentication flow"

# List all your running tasks  
gh agent-task list

# Watch it work in real-time
gh agent-task view 1234 --log --follow

This solves the “black box” problem I had with the GitHub MCP server. Before, you could trigger the coding agent but had zero visibility into what it was doing. Now you can actually see the work happening and integrate it into scripts.
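
As a concrete (if minimal) sketch of that scriptability - and assuming, as the later examples in this post do, that the create command prints the new task’s ID - you can chain the two commands:

# Kick off a task, capture its ID, and follow its log from one script
# (assumption: `gh agent-task create` prints the new task ID to stdout)
TASK_ID=$(gh agent-task create "add input validation to the signup endpoint")
gh agent-task view "$TASK_ID" --log --follow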

For the full command reference, see GitHub CLI 2.80.0 release notes.

Tool 2: Standalone Copilot CLI

This is completely separate. You install it with npm install -g @github/copilot and it becomes an interactive AI partner in your terminal:

# Interactive mode - have a conversation
$ copilot
> Help me find all the CSV files in this directory recursively
AI suggests: find . -name "*.csv" -type f

# Autonomous mode - one-shot commands  
$ copilot -p "create a Python script to parse log files"
# AI writes the script, asks permission, then creates the file

The Key Difference

GitHub CLI agent-tasks = manage long-running coding projects (like delegating work to a team member)

Copilot CLI = interactive terminal assistance (like pair programming with AI)

Here’s where it gets interesting. You can combine both:

# Use Copilot CLI to craft the perfect task description
$ copilot -p "help me write a task description for refactoring our auth system"

# Then delegate it to the coding agent
$ gh agent-task create "$(copilot -p 'write task: refactor auth system')"

# Monitor it while doing other work
$ gh agent-task view $TASK_ID --log --follow

We just went from “AI helps me code” to “AI runs my entire development process.” That’s not an incremental improvement. That’s a category shift.

The Missing Piece: Context-Aware AI That Runs Everywhere

To understand why this matters, you have to think about what makes these CLI releases fundamentally different from other AI development tools. It’s not that GitHub suddenly built smarter AI. OpenAI and Anthropic probably have better raw models. What’s different is that GitHub’s AI already knows your codebase.

When you call OpenAI’s API or use Claude directly, you’re starting fresh every time. You have to explain your architecture, your patterns, your naming conventions. You’re basically teaching the AI about your project from scratch with every interaction. It’s powerful, but it’s also exhausting.

GitHub’s coding agent is different because it lives in your repository. It already understands your issues, your pull requests, your workflow patterns. It knows how your team writes code. And now, with CLI access, that context-aware intelligence can work automatically in your production workflows.

Here’s what that means practically: when your monitoring system detects a performance issue, the GitHub coding agent doesn’t just get the error message. It gets your entire codebase context, recent deployments, related issues, and team patterns. When you trigger an agent-task from your CI pipeline, it’s not running generic analysis - it’s applying intelligence that already knows your specific architecture, coding standards, and business logic.
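
As a rough sketch of what that CI hook could look like (the smoke-test script is a hypothetical placeholder; GITHUB_SHA is the commit variable GitHub Actions sets; the agent-task call is the same command shown above):

# Minimal CI sketch: delegate investigation when a post-deploy check fails
# (./run_smoke_tests.sh is a hypothetical script; GITHUB_SHA is set by GitHub Actions)
if ! ./run_smoke_tests.sh; then
  gh agent-task create "Post-deploy smoke tests failed on commit $GITHUB_SHA. \
    Review the recent deployment, find the regression, and open a PR with a fix."
fi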

The Model Selection Catch

Here’s something important I discovered while testing these tools: you can only choose which AI model to use with the standalone Copilot CLI, not with the agent-task commands.

The agent-task commands are locked to whatever model GitHub has configured for their coding agent - currently Claude 4 Sonnet as of September 2025. There’s no way to switch it to GPT-5 or any other model. The standalone Copilot CLI, on the other hand, lets you pick your model by setting an environment variable before running commands.
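
To make the difference concrete, here’s a rough sketch - the environment variable name below is an assumption for illustration only, so check the Copilot CLI documentation for the actual setting:

# Standalone CLI: the model choice is yours
# (COPILOT_MODEL is an assumed variable name, not verified against the docs)
COPILOT_MODEL="gpt-5" copilot -p "design a retry strategy for our flaky webhook consumer"

# Agent task: no model option to set - you get whatever GitHub has configured
gh agent-task create "add retries with exponential backoff to the webhook consumer"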

This creates an interesting tradeoff. The agent-tasks give you AI that truly understands your specific project context, but you’re stuck with GitHub’s model choice. The standalone CLI lets you choose between Claude or GPT-5, but each conversation starts fresh without deep knowledge of your codebase.

In practice, this means you get context or you get control, but not both. For most workflows, I’d choose context over control - having AI that knows your repository is more valuable than being able to switch models. But for complex reasoning tasks where you need GPT-5’s capabilities, the standalone CLI becomes the better choice.

What the Web Interface Doesn’t Want You to Know

If you read GitHub’s official documentation about Copilot coding agent limitations, you’ll see statements like “You cannot change the AI model” and “You cannot integrate with external systems.” Reading this, you’d think these are fundamental technical constraints.

But the CLI releases expose these as design choices, not technical limitations. The agent-task commands let you script everything, monitor progress in real-time, and integrate with any tool that can run shell commands. The standalone Copilot CLI gives you model selection that the web interface deliberately hides.

This reveals something important about how developer tools get designed. When companies build “user-friendly” interfaces, they often hide capabilities to avoid overwhelming users. The problem is that hiding complexity also hides possibility. The web interface trains you to think of AI as a black box you occasionally visit, rather than as programmable infrastructure you can integrate into your workflows.

The CLI approach is different - it makes AI composable. Instead of protecting you from complexity, it gives you the tools to manage complexity. That’s the difference between convenient shortcuts and real automation.

Real Examples: What You Can Build When Both Tools Work Together

Once you have both an interactive AI assistant and a way to manage long-running coding tasks, the possibilities get wild. Here are some workflows, from beginner to advanced:

Simple Debug Session (Beginner-Friendly)

#!/bin/bash
# Use both tools to debug a failing test

# First, get quick guidance from Copilot CLI
copilot -p "My test is failing with 'connection timeout'. What should I check first?"

# Based on the advice, let the agent investigate and fix
gh agent-task create "Test 'user-login-test' is failing with connection timeout. \
  Check database connection, network config, and timeout settings. \
  Fix any obvious issues you find."

# Monitor the progress
gh agent-task list

Smart Performance Monitoring (Using Both Tools)

#!/bin/bash
# When servers get slow, use both AIs to investigate and fix
# Note: Assumes the get_cpu_usage() function and the notify-team command are defined elsewhere

while true; do
  if [ $(get_cpu_usage) -gt 80 ]; then
    echo "CPU usage high, investigating..."
    
    # First, use Copilot CLI to quickly analyze what's happening
    ANALYSIS=$(copilot -p "Help me understand what might cause CPU usage of $(get_cpu_usage)% in a web app")
    
    # Then delegate the actual investigation to the coding agent
    TASK_ID=$(gh agent-task create "CPU is at $(get_cpu_usage)%. \
      Analysis suggests: $ANALYSIS \
      Investigate recent deployments and create a fix.")
    
    echo "Created task $TASK_ID to investigate. Monitoring progress..."
    
    # Watch for completion and take action
    gh agent-task view $TASK_ID --log --follow | \
      grep -i "pull request" | \
      while read pr_line; do
        echo "Performance fix ready: $pr_line"
        notify-team "AI created performance fix: $pr_line"
      done
  fi
  sleep 300
done

Intelligent Code Review Pipeline

#!/bin/bash
# Use both tools for comprehensive code reviews

# When a new PR is created (webhook trigger)
PR_NUMBER=$1
PR_TITLE=$(gh pr view "$PR_NUMBER" --json title -q .title)

# First, get quick insights from Copilot CLI
REVIEW_FOCUS=$(copilot -p "What should I look for when reviewing a PR titled '$PR_TITLE'? Give me 3 key areas to focus on.")

# Then delegate the actual review to the coding agent
gh agent-task create "Review PR #$PR_NUMBER. Focus on: $REVIEW_FOCUS. \
  Look for bugs, security issues, and maintainability problems. \
  Add review comments and create follow-up tasks for any issues."

Development Workflow Orchestration

#!/bin/bash
# Complete development workflow using both tools

# Daily maintenance routine
daily_maintenance() {
  # Use Copilot CLI to plan what needs attention
  PRIORITIES=$(copilot -p "Look at our recent commits and issues. What are the top 3 maintenance tasks I should focus on today?")
  
  echo "Today's AI-suggested priorities: $PRIORITIES"
  
  # Create agent tasks for each priority
  echo "$PRIORITIES" | while IFS= read -r task; do
    if [[ -n "$task" ]]; then
      gh agent-task create "$task - make it production ready"
    fi
  done
}

# Smart test generation from failures  
monitor_production_errors() {
  tail -f /var/log/app.log | grep --line-buffered ERROR | while read -r error; do
    # Quick analysis with Copilot CLI
    TEST_STRATEGY=$(copilot -p "How should I test for this error: '$error'?")
    
    # Create comprehensive tests with coding agent
    gh agent-task create "Production error: '$error'. \
      Testing strategy: $TEST_STRATEGY \
      Write comprehensive tests to prevent this."
  done
}

The common pattern here? We’re moving from reactive to proactive. Instead of fixing problems after they happen, we’re building systems that think ahead and improve continuously.

More importantly, we’re combining quick AI assistance with deep AI work. Copilot CLI helps you think through problems fast. The coding agent executes the actual work. Together, they create workflows that are both intelligent and thorough.

The Economics Make Sense for Both Tools

Here’s something interesting about the pricing: both tools use your existing Copilot subscription and count against your monthly premium request quota. The specifics matter:

Agent-task commands: Each task counts as one premium request, regardless of complexity:

# These all cost the same: 1 request each
gh agent-task create "fix typo in README"
gh agent-task create "migrate our entire codebase to Python 3.12"  
gh agent-task create "do a full security audit and fix everything"

Copilot CLI: Each interaction (prompt) counts as one premium request:

# Each of these is 1 request
copilot -p "help me write a regex"
copilot -p "explain this error and suggest fixes"
copilot -p "create a complete monitoring dashboard"

Important pricing details:

  • Premium request quotas vary by plan (check GitHub Copilot billing docs)
  • You’re not charged per API call or line of code generated
  • Complex tasks cost the same as simple ones within each tool

This pricing model encourages ambitious automation. Don’t ration your AI usage. Don’t optimize for fewer requests. Build the automation you actually want.

Strategic insight: Use Copilot CLI for quick decisions and planning. Use agent-tasks for substantial work. This optimizes your premium request budget.
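
A quick sketch of that split in practice (the migration task itself is just a placeholder; the pattern is one cheap planning prompt, then one agent task for the heavy lifting):

# One premium request to plan, one to execute
PLAN=$(copilot -p "Outline the steps to migrate our logging to structured JSON output")
gh agent-task create "Migrate our logging to structured JSON output. Plan: $PLAN"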

Important Limitations and Security Considerations

While these tools are powerful, they come with important limitations and security considerations:

Security Risks:

  • Copilot CLI can modify files and execute commands - only use in trusted directories
  • Always review AI-generated code before running it, especially in production
  • Agent-task outputs should be reviewed for security vulnerabilities before merging (see the review-gate sketch below this list)
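
Here’s a minimal review-gate sketch; PR_NUMBER stands in for whichever pull request the agent opened, and nothing merges without an explicit human yes:

# Treat the agent's PR like any other contributor's PR
gh pr diff "$PR_NUMBER"       # read the full change before anything else
gh pr checks "$PR_NUMBER"     # confirm CI and security scans actually pass
read -p "Approve and merge? (y/N) " answer
[ "$answer" = "y" ] && gh pr merge "$PR_NUMBER" --squash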

Current Limitations:

  • No external integrations yet (tools work within GitHub ecosystem only)
  • Agent-tasks are repo-bound (no cross-repository context)
  • Both tools are in preview and may change significantly
  • Limited to GitHub’s model selection (you can’t use your own AI models)

Responsible Use:

  • Don’t blindly trust AI outputs - human oversight is essential
  • Start with non-critical tasks while you learn the tools’ behavior
  • Monitor your premium request quota to avoid service interruptions
  • Be mindful of sensitive data in prompts (logs may be retained)

We Just Crossed Multiple Lines We Can’t Uncross

Think about how AI coding tools have evolved, and what GitHub just delivered:

Phase 1: Autocomplete (AI suggests the next few characters)
Phase 2: Chat (AI answers questions and helps with tasks)
Phase 3: Interactive partnership (Copilot CLI becomes your terminal buddy)
Phase 4: Autonomous delegation (agent-tasks work independently on projects)

Most companies are still figuring out Phase 2. GitHub just delivered both Phase 3 and 4 at the same time.

That’s not incremental progress. That’s the difference between using AI tools and having AI colleagues.

# Interactive partnership
$ copilot
> I'm getting a weird database error. Help me debug it.
AI walks you through debugging step by step...

# Autonomous delegation  
$ gh agent-task create "Fix the database performance issues we just found"
AI goes away and comes back with a solution...

The combination is what makes this significant. You can brainstorm with one AI and delegate work to another. You can get instant feedback and long-term project execution. You can think fast and build thoroughly.

How Teams Will Actually Work

The most successful engineering teams are going to figure out how to split work between humans and AI effectively, and I think the division is becoming clearer.

Humans will still own the strategic decisions - architecture choices, priority setting, customer conversations. We’re also better at the ethical considerations and creative problem-solving when systems behave in unexpected ways. These require judgment, empathy, and the ability to see broader business context.

AI, on the other hand, is already excellent at maintaining consistency. It can keep code quality standards across a large codebase, write comprehensive test suites, monitor for security issues, and update documentation as code changes. These tasks require attention to detail and pattern recognition, but not creativity or judgment.

The interesting middle ground is where human expertise combines with AI execution. Code reviews will likely split this way: AI handles the mechanical checks for style violations and obvious bugs, while humans focus on logic, design decisions, and architectural implications. Planning becomes collaborative too - AI can suggest tasks based on codebase analysis, but humans decide priorities based on business needs.

Where This Is Really Heading

Here’s the part that gets me excited: we’re building systems that can improve themselves. Once AI can write code, test it, deploy it, monitor how it performs, and learn from the results, we’re not talking about tools anymore. We’re talking about software that evolves on its own.

# Imagine AI analyzing its own work
gh agent-task create "Look at all the code changes I've made this month. \
  Which ones worked well? Which ones caused problems? \
  Update your approach based on what you learned."

That’s a feedback loop that gets better over time. The AI learns from its successes and failures, just like a human developer would.

What You Should Do Right Now

Both tools are available today, though they’re still in preview status. Before you can use them, you’ll need a GitHub Copilot Pro+ subscription, and if you’re in an organization, make sure the CLI policy is enabled. Keep in mind that since these are preview features, they may change significantly without notice.

Getting started is straightforward - update your GitHub CLI to version 2.80.0 (gh can’t update itself, so use the package manager you installed it with, e.g. brew upgrade gh) and install the standalone Copilot CLI with npm install -g @github/copilot. But the real strategy is in how you use them together.

Start with quick wins rather than trying to automate everything at once. Use the Copilot CLI for those daily terminal tasks you’re always googling - you’ll be surprised how much faster it is than switching to a browser. For agent-tasks, pick one annoying maintenance job you do weekly and delegate that first.
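
A low-stakes first delegation might look something like this (the task wording is just an example):

# A gentle first agent task: weekly dependency hygiene
gh agent-task create "Check our dependencies for outdated or vulnerable packages \
  and open a PR that bumps anything safe to update"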

As you get comfortable, you’ll start to notice a natural rhythm emerging. The Copilot CLI becomes your thinking partner for quick questions and planning, while agent-tasks handle anything that takes more than fifteen minutes of sustained work. The real breakthrough happens when you start chaining them together - using insights from the interactive CLI to inform the work you delegate to the coding agent.

The teams that figure out this combination first are going to operate at a completely different level. They won’t just ship faster. They’ll build intelligent systems that improve themselves while the team focuses on innovation and strategy rather than maintenance and routine tasks.
