Ethics in AI pair programming

Tabnine Team | 3 min read | June 22, 2022

As ethics questions go, whether AI should be applied to programming might not rate all that high on the list of thorny moral conundrums. But AI is a part of our lives now, from the curated content on our Twitter feeds to the music we listen to. In a technologically driven world, it’s hard to find an activity where machine learning code isn’t somewhere close at hand or actively assisting our pursuits, whether we like it or not. Still, are there ethical considerations that Tabnine, or any company involved in AI code assistance, should think about? I think the answer is an absolute yes!

Framing the problem is a good place to start. Where do most companies in need of solid programming skills find talented folks? Really good coders are hard to come by these days and awfully expensive when you can get them on board. If it works for you, you can find talented folks overseas or working remotely, but that introduces an extra layer of complexity in managing the team and, for compliance reasons, isn’t always an option. So if we are short on skills, and the availability of people becomes the long pole in a product development roadmap, what options are we left with? Languages are becoming easier to use. Functional libraries in Java, JavaScript, Python, or any of the many modern languages can help wrap up complex code and make things faster, but bless those Haskell devs out there…

Still, transferring knowledge, refactoring code to address tech debt, and writing documentation all require people with hands on keyboards, and if there just are not enough of them to go around, we have to fall back on technology. This is where AI-enabled tools for code can come into play. Done correctly, ML-based models for a company can have a good sense of existing code, can be contextually aware of comments in the code, and can help a new hire avoid mistakes that might slow down development.

However, ML doesn’t live in a vacuum, and it’s only as good as the data it’s trained on. For a code model, that data is code, and that code was likely written by someone, probably by someone in the company, either currently or in the past. In this case, AI code enhancement is pretty cut and dried. But in-house code isn’t the only source of quality code. Open source is another data source for training your company’s model, and if it is used in accordance with the applicable licenses, it too can be a good place to get new libraries and best practices.

But it’s not a free-for-all out there. It’s vitally important to examine where code from outside your organization is coming from: first for provenance and quality, but, perhaps just as importantly, to ask whether it has been released for the purpose of model training. Back when we were all taking CS classes in college, we knew how frustrating it was to debug an assignment, and how much easier it would have been to cut and paste. Plagiarism isn’t really an easily detectable fault in coding. So perhaps it’s easy to see the issue in using AI to copy code whose creator granted no permission.

Can we fail in other ways, though? If it becomes easier and easier to let an algorithm write code for us, are we not opening ourselves up to an ethical failure of attentiveness and stewardship over our code? Propagating a zero-day vulnerability is an easy fault to imagine, but what of innovation? AI-assisted development helps us average folks achieve better code, but it also removes the frustration and unique thinking that can come along with solving a problem in a new way. We run the risk of making our code more similar, more stagnant, and possibly more vulnerable.

Ethics in coding, and the new wrinkles that AI brings to this arena, are far from easy to sort out. There may be more socially visible implications of AI, but as the world continues to become more entwined with technology, the ethics of the code that powers that technology should also be a part of our discussion.