My Interview with Polycom's Jeff Rodman

Aside from being a co-founder of Polycom and CTO of their Voice Communications Solutions division, Jeff Rodman is often referred to as the "father of HD voice". That's a pretty strong calling card, and I recently had a chance to conduct an interview with him about HD voice and the overall direction he's seeing for voice in the IP telephony space.

At TMC's IT Expo earlier this month, I gave a couple of presentations on SIP Trunking, so I've been quite attuned recently to how well these two ideas fit together. When service providers offer SIP Trunking, the enterprise gets the benefit of end-to-end IP, and one of those benefits is support for HD and wideband codecs. In this environment, VoIP can live up to its potential of not just being on par with TDM, but actually being superior. That may sound hard to believe (no pun intended), but if you've heard the difference between HD and narrowband voice, you'll know exactly what I'm talking about.

Back in August, Polycom announced the availability of its royalty-free wideband codec, Siren 7 (as in 7 kHz), and the company has been a leading proponent of bringing HD audio into the everyday audioconferencing experience. Wearing my TMC hat for a moment, I should add that Polycom sponsors a portal devoted to HD audio on TMCnet. It's a great resource, and Jeff Rodman serves as its resident Ask-the-Expert.

Well, before you run off to their portal, you should spend some time with me, since I have Jeff right here - in prose, anyway. Below is my interview with Jeff, and if this doesn't convince you that HD is a great value-add, I'm not sure what will. Enjoy...

Q: Let’s start at a high level and talk about how voice communications is evolving in today’s enterprise. Voice is no longer the domain of the PBX and we have more ways to talk than ever before. What does this mean for our expectations of the end user experience?

A: With workforces becoming more dispersed, enterprises are depending more and more on voice communications to keep their critical activities closely linked. We don’t always have time to send a confirming e-mail or text, so agreeing to a meeting at 2:15 can turn into a significant problem when the other person heard 2:50; and yet, if you just say these two numbers to yourself, you can hear how easily they could be confused over a conventional phone connection. That’s where technologies such as HD Voice make the difference: with this extended audio bandwidth, you can carry the full range of human speech, significantly reducing the chance of miscommunication.

People are talking business at home, in their offices, and on the road; background noise is ubiquitous, speakerphones get used in reverberant rooms, and sometimes it seems that every second person you talk to has an accent different from your own. In any of these scenarios, having the whole speech signal, instead of the one-fourth of it that conventional phones carry, makes a big difference. Which makes sense; our ears are designed to listen to the whole voice, not a small part of it.

Another thing is that although it’s always been essential, talk has become a part of a richer communication experience. Over multiple connections, we have access to tools for text, email, graphics exchange, high definition video, even real-time photo sharing. We expect to have realistic voice as well.

Q: As end users have more choices for communicating, how does this impact vendors like Polycom in terms of product development? Is it more important to focus on the underlying technologies and standards or the applications for end users?

A: Polycom remains focused on providing the best human communications, so we’re excited by the opportunity to deliver a richer communication experience to a growing variety of users and applications. As you say, end users have more choices, but to us here at Polycom, having more choices means that we have a deeper toolbox we can draw from to build those solutions. Telephony has burst forth as an IP butterfly after being a POTS caterpillar for fifty years; this is a really exciting time for us.

To your other question, the degree to which one should be interested in the technology itself really depends on who is looking at this. For the end user, it’s the experience that is important, within the context of their application, not the technology. A user should not need to be concerned with the underlying technologies; even the profound improvement of HD Voice, when compared with conventional narrowband audio, should just be transparently available to the end user.

Along those lines, we believe in providing a comprehensive communications experience that enables users not only to communicate in HD Voice, but also to take advantage of applications on their desktop phones in order to maximize productivity. To that end, we have enabled our phones to run applications that allow them to conduct business more efficiently by accessing critical data on their phones – for example, the corporate telephone directory – and to easily and intuitively manage conference calls as well as record voice conversations – all right on their phone.

For any organization, though, it’s the technologies and standards that make their jobs easier. Open Unified Communications means adherence to open standards to ensure a scalable and maintainable system architecture, and organizations want to be sure that they’re also using innovative, reliable, and economical technologies as the basis for their solutions. For this reason, Polycom is an active participant in standards bodies such as the ITU, IEEE and Wi-Fi Alliance. We realize that in the end, the organization still needs to keep the users in mind. Technology and open standards are just the means to that end, serving the users, and this is true both for us as developers and manufacturers and for the organizations that support and leverage their most valuable asset, their users.

Q: Wireless is becoming much more central for enterprise communications and a driver for this evolution process away from the traditional PBX. What trends are you seeing here, and where do you see wireless adding the most value for enterprises?

A: Enterprise wireless is becoming mainstream now that we’ve reached a mature stage with the technology and standards. Wi-Fi is already being deployed in a host of enterprise environments where mobile voice and data access is critical for key employees. Wi-Fi networks are secure and reliable enough for any enterprise application, and are a perfect complement to VoIP technology. Many of the early deployments of Wi-Fi in office environments were done only to support wireless data access, but today enterprises are planning their wireless infrastructure investments for both voice and data use.

The greatest value of a converged enterprise wireless solution is in improving employee productivity through better mobility and responsiveness. There are many more benefits to be realized as new applications and unified communication become integrated into the wired and wireless enterprise.

Q: Before focusing directly on the topic of wideband codecs, I’d like to hear your thoughts on what you’re seeing in the market in terms of the disparities in quality between landline and mobile telephony. How much of an issue is this for enterprises, and how much are they looking to vendors like Polycom for solutions?

A: There’s a big difference in experience between wired and unwired telephony in the great uncontrolled outdoors, versus in a well-managed enterprise environment. Enterprises can make Wi-Fi network investments to meet their own requirements for call quality and capacity rather than relying on whatever wireless service providers have to offer. Deploying Wi-Fi can be much more cost-effective than paying monthly airtime for a broadband data service that might have limited coverage and capacity. And as we all know, the cellphone remains the least reliable and poorest sounding telephone connection we can make, but its exceptional versatility still makes us love it.

However, this distinction disappears when we compare wired and enterprise-grade Wi-Fi phones, because Wi-Fi telephony in the enterprise delivers both high reliability and high quality - within the enterprise, you don’t have to surrender a dependable high-quality connection for mobility. Audio quality via Wi-Fi technology is robust and reliable when properly deployed within an enterprise setting, and even HD Voice can be carried over wireless links. It’s not only Wi-Fi, either; we expect to see audio bandwidth through cellphones improving over the next several years, driven by user demand and the availability of higher-fidelity devices, and enabled by high-efficiency wideband codecs such as G.722.2.

Q: As voice becomes integrated more and more with other workplace applications, the underlying technologies face new challenges. What do you see as the main challenges here and how well are they being addressed?

A: In the early days of VoIP, we experienced problems with jitter and packet loss, as enterprise networks were just beginning to understand the difference between conventional data and real-time media streams. Today, however, those issues are well understood. With most enterprise IP networks having already moved from 10 Mbps to 100 or 1000 Mbps, and with best practices for VoIP being broadly disseminated, transitioning business telephony from dedicated analog or digital lines to a VoIP network is typically a fast and reliable process. This is fortunate, because business processes are taking advantage of the ability to link voice, video, and data in different ways.

The integrated address book on a desktop phone is one good example of bringing together two disparate forms of data via open standards, conveniently bridging two needs and two networks with one application, and on-screen conference call management is a good example of how the unified network can simultaneously manage, monitor, and document real-time, multi-participant collaboration.

Q: As communications technologies evolve, would you say that the real-time nature of voice is its strongest attribute, or are there other elements that are now more important, such as voice quality or a sense of social intimacy?

A: A good, fast back-and-forth discussion with someone quickly demonstrates how the immediacy of voice is still essential, but the transition to HD Voice within VoIP means that we can now hear the person as if they are in the same room as we are: their identity is clear, their words are unblurred. Whether hearing them accurately or being able to tell which person is speaking is more important probably depends on the nature of each particular interchange, but I think that with HD Voice over a real-time connection, we’re finally delivering the genuine and vivid speech experience that some of us have dreamed about providing for many years.

Q: I’d like to focus now on voice quality, and High Definition in particular. Most of the early challenges around the quality of VoIP have been addressed, but we’re not at the point yet where it can truly deliver a superior experience to TDM on a widespread scale. What are the key technologies that can enable this experience, especially around wideband codecs, and what can the market expect to see in the near future?

A: The big step up in sound quality is moving to HD Voice at 7 kHz. This captures an important part of the human voice, a whole range that’s missing in analog 3 kHz telephony. And it goes beyond that, too; on the other end, normal telephony doesn’t send much below 300 Hz, yet the human voice communicates below 100 Hz. These low frequencies convey presence, the feeling of “yeah, that person really feels like they’re here.” Three modern standard codecs are available today that can carry this whole range, G.722, G.722.1 (Siren 7), and G.722.2 (AMR-WB), which give excellent sound quality while allowing a range of choices to fit each application. G.722 is already very common in VoIP endpoints from some manufacturers, and we will be seeing adoption of the others as lower bit rate and cellphone HD Voice compatibility become more popular.

One thing that is often misunderstood is how much data bandwidth is needed by wideband audio. Wideband codecs don’t actually need any higher data rates than G.711, and some are significantly less: G.722.1’s rate is half that of G.711, G.722.2 is a quarter. This efficiency is possible because these more recent algorithms can run economically on today’s in-phone processors, which was not the case ten years ago. That’s part of your question, I think: processors, microphones, speakers, and the network itself have all evolved to the point that end-to-end HD Voice is not significantly more of a burden than narrowband.
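For readers who want to check the arithmetic behind that comparison, here is a rough sketch in Python. The bit rates are not from the interview; they are the commonly published nominal rates for each codec, and I have picked the modes that best match Jeff's "half" and "a quarter" (G.722.1 also runs at 24 kbit/s, and AMR-WB spans roughly 6.6 to 23.85 kbit/s). The 20 ms packet interval and the function names are my own assumptions for illustration.

```python
# Nominal codec bit rates in kbit/s. These are assumed, commonly
# published values chosen for illustration, not figures from the interview.
G711_KBPS = 64.0  # narrowband baseline

WIDEBAND_KBPS = {
    "G.722": 64.0,
    "G.722.1 (Siren 7)": 32.0,   # also defined at 24 kbit/s
    "G.722.2 (AMR-WB)": 15.85,   # one of several AMR-WB modes
}


def ratio_to_g711(kbps):
    """Return a codec's bit rate as a fraction of the G.711 rate."""
    return kbps / G711_KBPS


def payload_bytes(kbps, frame_ms=20):
    """Return audio payload bytes per packet, excluding IP/UDP/RTP headers.

    kbit/s multiplied by milliseconds gives bits; divide by 8 for bytes.
    The 20 ms packet interval is an assumed, typical value.
    """
    return kbps * frame_ms / 8


if __name__ == "__main__":
    for name, kbps in WIDEBAND_KBPS.items():
        print(
            f"{name}: {kbps} kbit/s, "
            f"{ratio_to_g711(kbps):.2f}x G.711, "
            f"{payload_bytes(kbps):.1f} payload bytes per 20 ms"
        )
```

Note that this only compares the audio payload. Packet headers add the same fixed overhead to every codec, so the savings on the wire are somewhat smaller than these ratios suggest.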

Demonstrating this point, many vendors today are shipping wideband VoIP phones, using the popular G.722 codec for 7 kHz audio. At this point, G.722 is the language that all wideband systems can speak, and we expect that to continue growing in popularity. However, we also expect to see growing deployment of the other codecs as well, and some phones will ship with all three of the “seven twenty-twos.” The challenge until now has been the ability of service providers to offer wideband telephony across their entire network. The good news is that more and more of our partners are doing just that, and we believe the benefits of wideband can soon be realized by the masses.

Q: What enterprise applications do you see having the most appeal and value for HD VoIP? How do you compare the appeal across various end user environments, and where do you see it bringing the most value – specifically, the desktop screen, the desk phone, the mobile phone, audio conferencing and video conferencing?

A: Because so much of our communication is voice communications, the ability of HD VoIP to make speech substantially clearer and more accurate is pervasive - any application in which one person is talking to another is tangibly benefited by the improved clarity of HD VoIP. Some of the places where this has the strongest effect are those where facts are being discussed (on conventional phone connections without HD Voice, there’s almost no audible difference between “sixty million” and “sixteen million”), or where people are dealing with accents or hands-free phones, such as in audio conferences.

HD Voice can compensate for a lot of the degradation that noise and room acoustics add in conference groups, and it can also greatly help you understand someone who talks with a different accent from your own. Additionally, since listening is less fatiguing with HD Voice, phone meetings in all environments tend to remain more focused and productive.

And while much of this may sound theoretical, we’ve heard it firsthand from some of our customers – how much easier it is to do business with HD Voice. For example, one customer, headquartered in Japan with offices around the globe, told us they communicate far more effectively now that they’re using HD Voice. Accents are that much easier to understand.

Q: How well do enterprises understand the notion of CEBP – communications enabled business processes? Is it in line with how vendors such as Polycom see it, and how well can CEBP deliver on its value proposition?

A: With the globally competitive business environment, companies are constantly seeking to improve the productivity of their workforce. Businesses are just getting exposed to the CEBP concept in a broad sense. However, they have actually applied CEBP to niche applications for a long time. The simple act of paging a doctor to come back to the office is a CEBP that’s been around for more than 20 years!

What’s new is that the power of the Internet, with data and decision making in network based applications, is converging with the richness and ubiquitous presence of next generation communications devices to make CEBP more effective, and applicable in broader contexts.

The success of CEBP is going to be driven by melding the underlying objectives of a business process (for example, “improve patient response”) with the capabilities of the technology to define a new business process that achieves the objectives in a different, more optimal way, using communications devices. This is where having a broad set of solution partners in the actual business areas is enabling Polycom to deliver significant value around CEBP.

Q: Where do you see HD VoIP having the most impact for supporting CEBP going forward?

A: HD Voice has an essential role whenever voice is used as the information exchange medium in a business process. While written media like text and email are usually clear, it’s not hard to confuse, for example, “FCC” with “SEC” when spoken in a conventional phone message. Many business processes don’t easily make allowances for errors and repeats, so the consequences can be severe. Any time information transfer via human speech is an important part of a business flow, HD Voice should be seriously considered to increase efficiency, improve speed, and reduce mistakes.

Q: What’s the best thing that could happen now to accelerate the adoption or demand for HD VoIP?

A: With increased business focus on distributed workforces and travel reduction, HD Voice plays a big role in keeping businesses well connected because it restores the “like being there” quality to informal conversations and formal discussions. We’re at an interesting point in the evolution of HD Voice. Compatible, standards-based technology exists, is openly available, and is being deployed in numerous endpoints, service providers and networks, but full end-to-end connectivity is still building within the cloud. As vendors and service providers hear that their customers want HD Voice, they are accelerating its incorporation in their offerings; one way to help this happen is to just let your provider know that you want HD Voice.

Q: As a vendor, what’s the most important message about voice communications you’d like to see enterprises take away from our interview?

A: I think nobody’s going to dispute that voice is a serious business communications tool, so the important things to remember are that HD Voice brings a major improvement in the quality and efficiency of this tool, and that HD Voice is cost-effective, reliable and interoperable, compatible with your VoIP network, and available today. The bottom line is that we experience high definition audio and video in our daily lives, and we EXPECT it in our homes when we watch TV or listen to music, in entertainment venues such as sporting events and movie theatres, and on the go with iPods and satellite radio. So why not expect an HD experience at work?

Everyone agrees the HD experience is superior. I think now it’s just a matter of the workplace catching up technologically with our lives outside the office. And this is a change in and of itself. It used to be that the newest, best technologies were only available at work. As a younger workforce enters the business world, this will change, and we’re very excited to be there to see and hear about these changes as they occur – all in HD, of course.

On the applications front, we’re also seeing some uptake and tremendous opportunity in the marketplace. If you look at how we use our mobile phones today, they are far more than voice devices. We use them for SMS, Web browsing and email. Polycom realizes that end users want their phones to offer more than just voice, and we’re making sure we offer that today with additional productivity enhancing apps coming soon. In the end, we just want to make it easier and faster for our customers to get the job done.