A general rule for HO engines over the last 20 years has been to pack the shell with as much weight as possible. This allows the engine to pull the most train. When the concept of DCC ready originated, room was allowed for a decoder. With sound, adding speakers is a challenge, and it usually requires some milling of the weight. Recently people have been using iPhone speakers with amazing results. These are very thin, so the amount of milling is reduced.
I mention this because, in general, sound is tied to a decoder module. Most engines with sound decoders will also run on DC powered track, but the supply cannot put out pulsed power, since that confuses the electronics. On DC, the decoder plays a predetermined sound program based on the power setting. You get less control of the sound than you do with DCC, but you have sound. The other advantage is that when you go to DCC, if you ever do, those sound engines are already DCC compatible.
If you do not see yourself going to DCC, then there are a few other options. RailPro is one. Bachmann has another. There is a third whose name I cannot remember right now. These all require a module in the engine, similar to the decoder, and they require a speaker system as well. The difference is that the DC power supply is always on max, and the module electronics control the engine in a manner similar to DCC. The space issues are the same as with the DCC system.
A third option is the under layout systems, like Rolling Thunder, and I believe Kato has one. The latter is more focused on N scale today, where the space issues are really critical. These have the advantage of being separate systems, so the engines are not impacted. The obvious disadvantage is that the sound does not travel with each engine.
There is a lot to consider, not the least of which is whether it is sound or just noise. Not an easy choice.
Larry
www.llxlocomotives.com