Download links for the required resources (all free, shared for the sake of sharing)
Tesseract: http://download.csdn.net/detail/chenyangqi/9190667
jai_imageio-1.1-alpha, swingx-1.0: http://download.csdn.net/detail/chenyangqi/9190683
HttpWatch Professional: http://download.csdn.net/detail/chenyangqi/9208339
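Project overview:
Our school uses the campus Ruijie client for internet access. I had not typed my password in so long that I forgot it; all I remembered was that it was six digits, numbers only. I did not want to go to the service center to reset it, so I decided to simulate a login to the Ruijie web portal over HTTP and run through the passwords from 000000 to 999999, learning something along the way. Enough preamble, on to the topic. The login page, shown below, is conventional: username, password, CAPTCHA: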
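1: Download and install Tesseract
The download link is at the top of this post; download it and run the installer. The archive also includes the Chinese language pack. Tesseract is an OCR engine that supports Chinese recognition, which is why I chose it. For installation details and how to verify that the install succeeded, you can look them up online; I won't go over them here.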
I installed it to D:\Program Files (x86)\Tesseract-OCR. Within that directory, tessdata is the folder that holds the language packs.
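As a quick sanity check before running any OCR code, something like the following sketch could confirm that a directory looks like a Tesseract install. It only assumes what is described above: an executable named tesseract (tesseract.exe on Windows) and a tessdata folder. The class and method names are made up for illustration.

```java
import java.io.File;

// Hypothetical helper, not part of the original project.
class TesseractInstallCheck {

    /**
     * Returns true if the directory contains a tesseract executable
     * and a tessdata folder (where the language packs live).
     */
    static boolean looksInstalled(File installDir) {
        File exeWindows = new File(installDir, "tesseract.exe");
        File exeUnix = new File(installDir, "tesseract");
        File tessdata = new File(installDir, "tessdata");
        boolean hasExecutable = exeWindows.isFile() || exeUnix.isFile();
        return hasExecutable && tessdata.isDirectory();
    }

    public static void main(String[] args) {
        // The install path mentioned in this post; adjust it for your machine.
        File dir = new File("D:\\Program Files (x86)\\Tesseract-OCR");
        System.out.println("Tesseract found: " + looksInstalled(dir));
    }
}
```

This checks only that the files exist, not that Tesseract actually runs.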
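2: Getting the CAPTCHA
Download HttpWatch (link at the top of this post), install it, and use it in IE to capture the traffic of the login page. From the capture, locate the URL of the CAPTCHA image, as shown in the screenshot (for how to install and use HttpWatch, look it up online).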
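Once the CAPTCHA URL is known, the image has to be saved locally before it can be passed to the OCR step. A minimal sketch of that download, using only the JDK, might look like this. The class name is hypothetical, and the real CAPTCHA URL is whatever the capture shows; none is assumed here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original project.
class CaptchaDownloader {

    /**
     * Copies the bytes served at captchaUrl into target,
     * overwriting any existing file. Returns the number of bytes written.
     */
    static long download(URL captchaUrl, Path target) throws IOException {
        try (InputStream in = captchaUrl.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

Note that this sketch ignores cookies. A CAPTCHA is typically tied to a session, so a real login flow would probably need to fetch the image in the same session as the login request.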
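3: CAPTCHA recognition in Java
Create a new Java project in Eclipse and add the two jars jai_imageio-1.1-alpha.jar and swingx-1.0.jar (links at the top of this post) to it. Now for the code. There are two classes, OCR.class and ImageIOHelper.class; you only need to change the Tesseract install path, and then you can drop them straight into your own project.
The code for OCR.class is below. The imageFile argument of recognizeText(File imageFile, String imageFormat) is the local location of the CAPTCHA image downloaded in the previous step, here D:/verifycode.jpg.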
package com.cyq.request;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

import org.jdesktop.swingx.util.OS;

public class OCR {
    // Tesseract's language flag: a lowercase letter l, not the digit 1.
    // It is declared but not added to the command below,
    // so Tesseract runs with its default language.
    private final String LANG_OPTION = "-l";
    private final String EOL = System.getProperty("line.separator");
    // Tesseract install path; change this to match your machine
    private String tessPath = "D:/Program Files (x86)/Tesseract-OCR";

    public String recognizeText(File imageFile, String imageFormat) throws Exception {
        // Tesseract is given a TIFF copy of the image
        File tempImage = ImageIOHelper.createImage(imageFile, imageFormat);
        // Tesseract appends ".txt" to this output base name
        File outputFile = new File(imageFile.getParentFile(), "output");
        StringBuffer strB = new StringBuffer();

        // Command line: tesseract <image> <output base>
        List<String> cmd = new ArrayList<String>();
        if (OS.isLinux()) {
            cmd.add("tesseract");
        } else {
            cmd.add(tessPath + "/tesseract");
        }
        cmd.add(tempImage.getName());
        cmd.add(outputFile.getName());

        ProcessBuilder pb = new ProcessBuilder();
        pb.directory(imageFile.getParentFile());
        pb.command(cmd);
        pb.redirectErrorStream(true);
        Process process = pb.start();
        int w = process.waitFor();

        // Delete the temporary working file
        tempImage.delete();

        if (w == 0) {
            // Success: read the recognized text back from output.txt
            BufferedReader in = new BufferedReader(new InputStreamReader(
                    new FileInputStream(outputFile.getAbsolutePath() + ".txt"), "UTF-8"));
            String str;
            while ((str = in.readLine()) != null) {
                strB.append(str).append(EOL);
            }
            in.close();
        } else {
            String msg;
            switch (w) {
            case 1:
                msg = "Errors accessing files. There may be spaces in your image's filename.";
                break;
            case 29:
                msg = "Cannot recognize the image or its selected region.";
                break;
            case 31:
                msg = "Unsupported image format.";
                break;
            default:
                msg = "Errors occurred.";
            }
            // Report the failure; the method then returns an empty string
            System.err.println(msg);
        }
        new File(outputFile.getAbsolutePath() + ".txt").delete();
        return strB.toString();
    }
}
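The OCR class declares LANG_OPTION ("-l") but never passes it to Tesseract. As a sketch of how the command line could be assembled with an optional language, here is a small standalone builder. It follows the usual tesseract invocation of image name, output base name, then -l and a language code such as eng. The class and method names are hypothetical and not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original project.
class TesseractCommand {

    /**
     * Builds: <executable> <imageName> <outputBase> [-l <lang>]
     * Tesseract writes its result to <outputBase>.txt.
     * Pass null or "" as lang to leave the language at Tesseract's default.
     */
    static List<String> build(String executable, String imageName,
                              String outputBase, String lang) {
        List<String> cmd = new ArrayList<String>();
        cmd.add(executable);
        cmd.add(imageName);
        cmd.add(outputBase);
        if (lang != null && !lang.isEmpty()) {
            cmd.add("-l"); // lowercase letter l, not the digit 1
            cmd.add(lang);
        }
        return cmd;
    }
}
```

The resulting list can be handed to ProcessBuilder.command(...) the same way the OCR class does. Selecting a language only works if the matching language pack is present in tessdata.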
The code for ImageIOHelper.class:
package com.cyq.request;

import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import java.util.Locale;

import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.metadata.IIOMetadata;
import javax.imageio.stream.ImageInputStream;
import javax.imageio.stream.ImageOutputStream;

import com.sun.media.imageio.plugins.tiff.TIFFImageWriteParam;

public class ImageIOHelper {
    /**
     * Converts an image file to TIFF format.
     *
     * @param imageFile   path of the image file
     * @param imageFormat file extension of the image
     * @return the temporary TIFF file
     */
    public static File createImage(File imageFile, String imageFormat) {
        File tempFile = null;
        try {
            // Read the source image with a reader for its format
            Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(imageFormat);
            ImageReader reader = readers.next();
            ImageInputStream iis = ImageIO.createImageInputStream(imageFile);
            reader.setInput(iis);
            IIOMetadata streamMetadata = reader.getStreamMetadata();

            // Write the TIFF without compression
            TIFFImageWriteParam tiffWriteParam = new TIFFImageWriteParam(Locale.CHINESE);
            tiffWriteParam.setCompressionMode(ImageWriteParam.MODE_DISABLED);

            Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("tiff");
            ImageWriter writer = writers.next();

            BufferedImage bi = reader.read(0);
            IIOImage image = new IIOImage(bi, null, reader.getImageMetadata(0));
            tempFile = tempImageFile(imageFile);
            ImageOutputStream ios = ImageIO.createImageOutputStream(tempFile);
            writer.setOutput(ios);
            writer.write(streamMetadata, image, tiffWriteParam);

            ios.close();
            iis.close();
            writer.dispose();
            reader.dispose();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return tempFile;
    }

    // Derives the temporary file name, e.g. verifycode.jpg -> verifycode0.tif
    private static File tempImageFile(File imageFile) {
        String path = imageFile.getPath();
        StringBuffer strB = new StringBuffer(path);
        // Insert a "0" before the extension so the original file is not overwritten
        strB.insert(path.lastIndexOf('.'), 0);
        // Replace the extension with "tif"
        return new File(strB.toString().replaceFirst("(?<=\\.)(\\w+)$", "tif"));
    }
}
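To make the temporary file naming concrete, here is a small standalone sketch of what tempImageFile is meant to do: insert a 0 before the extension, then swap the extension for tif. The class name is made up for illustration, and the sketch assumes the file name contains a dot.

```java
import java.io.File;

// Hypothetical standalone version of the naming step, for illustration only.
class TempTiffName {

    /** Maps e.g. "verifycode.jpg" to "verifycode0.tif". Assumes the name has an extension. */
    static File tempTiffFor(File imageFile) {
        String path = imageFile.getPath();
        int dot = path.lastIndexOf('.');
        // Insert "0" before the extension: verifycode.jpg -> verifycode0.jpg
        String withZero = path.substring(0, dot) + "0" + path.substring(dot);
        // Replace the trailing extension (word characters after the last dot) with "tif"
        return new File(withZero.replaceFirst("(?<=\\.)(\\w+)$", "tif"));
    }

    public static void main(String[] args) {
        System.out.println(tempTiffFor(new File("verifycode.jpg")).getPath());
    }
}
```

The regular expression uses a lookbehind for a literal dot, so only the extension at the very end of the path is replaced.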
After that, just call it from the class that holds your main method:
// Requires: import java.io.File;
private static String getCode() {
    String valCode = null;
    String path = "D:/verifycode.jpg";
    try {
        valCode = new OCR().recognizeText(new File(path), "jpg");
        System.out.println("CAPTCHA: " + valCode);
    } catch (Exception e) {
        e.printStackTrace();
    }
    return valCode;
}
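One detail worth noting: recognizeText appends a line separator after every line it reads, and OCR output can contain stray spaces, so the returned string is usually not ready to submit as is. A minimal cleanup sketch follows. It assumes the CAPTCHA consists only of ASCII letters and digits, which the post does not confirm, and the class name is hypothetical.

```java
// Hypothetical helper, not part of the original project.
class CaptchaText {

    /**
     * Strips everything except ASCII letters and digits from raw OCR output.
     * Assumption: the CAPTCHA itself contains only letters and digits.
     */
    static String clean(String raw) {
        if (raw == null) {
            return "";
        }
        return raw.replaceAll("[^A-Za-z0-9]", "");
    }

    public static void main(String[] args) {
        System.out.println(clean(" 4 7a9\r\n"));
    }
}
```

If the CAPTCHA uses other characters, the character class in the regular expression would need to be widened accordingly.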
That's it for CAPTCHA recognition. For the full HTTP login, see my next post (http://www.cnblogs.com/chenyangqi/p/4906376.html).
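Note: this post is the author's original work; please credit the source when reposting.
This simulation is for learning purposes only. Do not use it for illegal activity or brute-force attacks.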
Original title: Recognizing CAPTCHAs in Java (java识别验证码)
Keywords: JAVA