ACE-Step v1.5

Ultra-Fast Open-Source Music Foundation Model – Commercial-Grade Text-to-Music Generation on Consumer Hardware
Tool Release Date

31 Jan 2026


About This AI

ACE-Step v1.5 is a highly efficient open-source music foundation model developed by ACE Studio and StepFun, designed to deliver commercial-grade music generation locally on consumer hardware.

It combines a Language Model (LM) acting as an omni-capable planner with a Diffusion Transformer (DiT) for audio synthesis, using Chain-of-Thought reasoning to turn simple text prompts into detailed song blueprints that include metadata, lyrics, and captions.
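
A conceptual sketch of that two-stage flow is below; every name in it is a placeholder standing in for the released components, not the actual API.

    # Conceptual sketch of the LM-plans / DiT-renders split described above.
    # All names are illustrative stubs, not the released ACE-Step interface.
    from dataclasses import dataclass

    @dataclass
    class Blueprint:
        metadata: dict  # e.g. {"bpm": 120, "duration_s": 180}
        lyrics: str
        caption: str

    def plan(prompt: str) -> Blueprint:
        """Stage 1: the LM planner expands a short prompt into a full song
        blueprint (metadata, lyrics, caption) via Chain-of-Thought (stubbed)."""
        return Blueprint({"bpm": 120, "duration_s": 180}, "la la la", prompt)

    def synthesize(bp: Blueprint, sample_rate: int = 48_000) -> list[float]:
        """Stage 2: the DiT renders audio conditioned on the blueprint
        (stubbed here as silence of the planned duration)."""
        return [0.0] * (sample_rate * bp.metadata["duration_s"])

    song = synthesize(plan("upbeat synthwave with retro drums"))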

The model supports full song generation from short loops to 10-minute compositions, multilingual prompts in over 50 languages, strict prompt adherence, and versatile editing like cover generation, repainting, vocal-to-BGM conversion, and track extraction.

Trained on licensed, royalty-free, and synthetic data for legal compliance, it achieves ultra-fast inference: under 2 seconds per full song on an A100 GPU and under 10 seconds on an RTX 3090, with less than 4 GB of VRAM required.

Variants include base (medium quality, high diversity), SFT (high quality, medium diversity), turbo (very high quality, medium diversity), and an upcoming turbo-rl variant.

It enables lightweight personalization via LoRA training, capturing a custom style from just a few songs (a minimal training sketch follows).
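
As a rough illustration of how lightweight such adapters are, here is a minimal LoRA setup using the peft library on a stand-in backbone; the actual ACE-Step training scripts ship with the repo, and the backbone and module names below are placeholders, not the real model.

    # Minimal LoRA sketch with peft on a placeholder backbone; swap in the
    # real DiT model and its module names when using the official scripts.
    import torch.nn as nn
    from peft import LoraConfig, get_peft_model

    backbone = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True),
        num_layers=2,
    )

    lora_cfg = LoraConfig(
        r=16,                  # low-rank dimension; small ranks suffice for style
        lora_alpha=32,         # scaling factor for the adapter updates
        target_modules=["linear1", "linear2"],  # feed-forward layers per block
        lora_dropout=0.05,
    )

    model = get_peft_model(backbone, lora_cfg)
    model.print_trainable_parameters()  # only the small adapter weights train

Because only the adapter weights update, a handful of reference songs and modest VRAM are enough to capture a personal or artist-specific style.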

Released under the MIT license with full weights, inference code, and demos on Hugging Face, it is ideal for music artists, producers, content creators, and developers seeking powerful, fast, and ethical local music AI without cloud dependency.

Key Features

  1. Hybrid LM + DiT Architecture: The Language Model plans song structure via Chain-of-Thought while the DiT handles high-fidelity audio synthesis
  2. Full Song Generation: Creates complete tracks from short loops to 10-minute compositions with metadata, lyrics, and captions
  3. Multilingual Prompt Support: Strict adherence across 50+ languages for global creators
  4. Ultra-Fast Inference: Under 2 s on an A100 and under 10 s on an RTX 3090 for full songs; low VRAM use (less than 4 GB)
  5. Editing Capabilities: Cover generation, repainting, vocal-to-BGM conversion, track extraction
  6. LoRA Personalization: Train custom style LoRAs from just a few songs for unique sound
  7. High Quality Variants: Base (diverse), SFT (high quality), turbo (very high quality, fast), turbo-rl upcoming
  8. Commercial Compliance: Trained on licensed/royalty-free/synthetic data for legal use
  9. Local Deployment: Runs fully offline on consumer GPUs with Hugging Face Transformers/Diffusers
  10. Demo and Playground: Hugging Face Spaces for no-install testing and generation

Price Plans

  1. Free ($0): Full open-source access to all model weights, inference code, LoRA training, and demos under MIT license with no usage fees
  2. Cloud/Hosted (Paid via third parties): Optional API or hosted inference through platforms like WavespeedAI or ComfyUI services with token-based pricing

Pros

  1. Extremely fast local generation: Full songs in seconds on mid-range hardware, no cloud needed
  2. Commercial-grade quality: Outperforms many proprietary models on benchmark metrics while using ethically sourced training data
  3. Versatile editing toolkit: Unified support for covers, repaints, vocal isolation/conversion
  4. Lightweight customization: Easy LoRA training for personal or artist-specific styles
  5. Fully open-source: MIT license with weights, code, paper, and demos freely available
  6. Strong multilingual performance: Excellent prompt following in 50+ languages
  7. Low resource requirements: Runs on consumer GPUs with minimal VRAM
  8. Rapid inference variants: Turbo models enable even faster creation without quality loss

Cons

  1. Recent release: Community integrations and fine-tuning examples still emerging
  2. Requires GPU for best speed: CPU inference possible but much slower
  3. Complex setup for advanced use: Needs a properly configured environment (Transformers, Diffusers, CUDA/ROCm)
  4. Variable prompt adherence: Some users report inconsistencies in complex instructions vs demos
  5. Limited to music/audio: Focused on generation/editing, not general audio tasks
  6. No built-in UI beyond demos: Relies on code or third-party frontends like ComfyUI
  7. Potential artifacts in long tracks: 10-minute compositions may need careful prompting

Use Cases

  1. Music production prototyping: Quickly generate full tracks or loops for ideas and demos
  2. Content creation: Produce background music, jingles, or soundtracks for videos/podcasts
  3. Personalized music: Train LoRAs on favorite artist styles for custom generations
  4. Editing existing audio: Convert vocals to instrumental, repaint sections, or create covers
  5. Multilingual songwriting: Generate lyrics-aware music in native languages
  6. Game/film scoring: Fast iteration on ambient, thematic, or cinematic cues
  7. Creative experimentation: Explore genres, moods, or hybrid styles locally

Target Audience

  1. Music producers and artists: Needing fast local tools for creation and editing
  2. Content creators and YouTubers: Generating royalty-free music for videos
  3. Indie game developers: Creating custom soundtracks without licensing issues
  4. AI music enthusiasts: Experimenting with open-source models and LoRAs
  5. Developers and researchers: Building or studying music AI pipelines
  6. Commercial users: Seeking ethical, compliant AI for production workflows

How To Use

  1. Install dependencies: pip install transformers diffusers torch accelerate
  2. Load model: Use from_pretrained('ACE-Step/Ace-Step1.5') with the appropriate variant (base/sft/turbo)
  3. Generate music: Provide a text prompt plus optional lyrics; set inference steps (around 50 for base, 8 for turbo) and the CFG scale
  4. Run inference: Call pipeline(prompt) for audio output and save it as WAV/MP3 (a hedged sketch follows this list)
  5. Train LoRA: Use the provided scripts with a few reference songs to fine-tune style
  6. Use demos: Try Hugging Face Spaces playground for no-code generation
  7. Integrate ComfyUI: Install custom nodes for visual workflow and faster iteration
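
A hedged end-to-end sketch of steps 1 to 4 is below. The repo id and the Diffusers-style pipeline call follow the steps above, but the exact class, argument, and output names in the real release may differ, so treat this as a template rather than the official API.

    # Hedged sketch of load + generate (steps 2-4); argument and output
    # attribute names are assumptions based on typical Diffusers pipelines.
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(
        "ACE-Step/Ace-Step1.5",      # pick the base/sft/turbo variant here
        torch_dtype=torch.float16,   # halves VRAM use on consumer GPUs
    ).to("cuda")

    result = pipe(
        prompt="melancholic lo-fi jazz hip-hop beat, 85 BPM, instrumental",
        num_inference_steps=8,       # ~8 for turbo, ~50 for base
        guidance_scale=4.0,          # CFG scale; raise for stricter adherence
    )
    audio = result.audios[0]         # output attribute name is an assumption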

How we rated ACE-Step v1.5

  • Performance: 4.9/5
  • Accuracy: 4.7/5
  • Features: 4.8/5
  • Cost-Efficiency: 5.0/5
  • Ease of Use: 4.5/5
  • Customization: 4.9/5
  • Data Privacy: 5.0/5
  • Support: 4.6/5
  • Integration: 4.7/5
  • Overall Score: 4.8/5

ACE-Step v1.5 integration with other tools

  1. Hugging Face Transformers/Diffusers: Native support for easy loading and inference in Python scripts or notebooks
  2. ComfyUI: Custom nodes available for visual node-based workflow and faster music generation pipelines
  3. LoRA Training Tools: Built-in support for lightweight fine-tuning with tools like Kohya or custom scripts
  4. Audio Editors: Export WAV/MP3 files compatible with DAWs like Ableton Live, FL Studio, Logic Pro, or Audacity (a minimal export sketch follows this list)
  5. Third-Party Frontends: Integration with local UIs like Automatic1111-style interfaces or custom music apps via API wrappers
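
For integration item 4, a minimal export sketch: write the generated waveform to a WAV file with the soundfile library, then import it into any DAW. The placeholder array stands in for the pipeline output, and the 48 kHz sample rate is an assumption; use whatever rate the model actually reports.

    # Write audio to WAV for DAW import; `audio` is a placeholder waveform.
    import numpy as np
    import soundfile as sf

    audio = np.zeros((48_000 * 4, 2), dtype="float32")  # 4 s of stereo silence
    sf.write("generated_song.wav", audio, samplerate=48_000)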

Best prompts optimized for ACE-Step v1.5

  1. Energetic EDM festival anthem with heavy bass drops, soaring synth leads, female vocal chops saying 'feel the rhythm', build-up to massive drop at 32s, crowd cheers, 128 BPM, high energy, festival vibe
  2. Melancholic lo-fi jazz hip-hop beat, rainy night city vibes, soft piano chords, gentle saxophone solo, vinyl crackle, slow 85 BPM, nostalgic mood, chillhop style, instrumental only
  3. Epic orchestral cinematic trailer music, powerful strings and brass swells, thunderous percussion, choir chanting in Latin, dramatic tension build to heroic climax, Hans Zimmer style, 100 BPM
  4. Upbeat K-pop idol track with catchy chorus, bubbly synths, strong 4-on-the-floor beat, female vocals in Korean about summer love, bright and fun, 140 BPM, dance pop energy
  5. Dark trap beat with deep 808s, eerie bells, aggressive hi-hats, male rap verses about street life, auto-tuned hook, moody atmosphere, 140 BPM, modern hip-hop trap

ACE-Step v1.5 revolutionizes open-source music AI with blazing-fast local generation, commercial-quality output, and ethical training data. Its hybrid LM-DiT design delivers coherent full songs, multilingual support, and powerful editing in under 10 seconds on consumer GPUs. Ideal for creators seeking Suno-level results offline and at no cost, it comes highly recommended for music production.

FAQs

  • What is ACE-Step v1.5?

    ACE-Step v1.5 is a highly efficient open-source music foundation model for commercial-grade text-to-music generation, running locally on consumer hardware with ultra-fast inference and full song creation capabilities.

  • When was ACE-Step v1.5 released?

    It was released on January 31, 2026, with the technical paper published on arXiv (2602.00744) and models/weights on Hugging Face.

  • Is ACE-Step v1.5 free to use?

    Yes, fully free and open-source under MIT license with model weights, code, and demos available—no usage fees or subscriptions required.

  • What hardware does ACE-Step v1.5 need?

It runs locally with less than 4 GB of VRAM, generating full songs in under 10 seconds on an RTX 3090 or equivalent and in under 2 seconds on an A100 GPU.

  • Does ACE-Step v1.5 support lyrics and vocals?

    Yes, it generates complete songs with lyrics, vocals, and structure from text prompts, plus multilingual support in 50+ languages.

  • Can I customize ACE-Step v1.5 styles?

    Yes, lightweight LoRA training allows personalization from just a few songs to capture custom artist or genre styles.

  • What editing features does ACE-Step v1.5 have?

It includes cover generation, section repainting, vocal-to-BGM conversion, track extraction, and seamless long-form composition.

  • How does ACE-Step v1.5 compare to Suno or Udio?

It achieves comparable or better quality on benchmark metrics, runs fully local/offline, is open-source and free, and supports commercial use thanks to its compliant training data.
