CodeQL Java Testing

Preface

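CodeQL is essentially a query language, with a syntax similar to SQL.
CodeQL uses its analysis engine to analyze the relationships in the code and generates a code database. We can then write QL to run all kinds of queries against it, such as finding a method or a class, finding references to a method, or tracking how a parameter is passed along.
Predicates in CodeQL encapsulate conditions, much like methods. You need to work out the conditions you want to query before you know what to ask for.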
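
To make the idea of a predicate concrete, here is a minimal sketch that wraps one condition in a predicate and reuses it in a query. The predicate name `isParseObject` is made up for illustration; the library calls (`hasName`, `getDeclaringType`) are the same ones used later in these notes.

```ql
import java

/** Holds if `m` is a method named `parseObject`. The name `isParseObject` is illustrative only. */
predicate isParseObject(Method m) { m.hasName("parseObject") }

// Select every matching method together with the type that declares it.
from Method m
where isParseObject(m)
select m, m.getDeclaringType()
```

Once a condition lives in a predicate, several queries can share it without repeating the `where` clause.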

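Common commands for building the database

Build while skipping tests (the option is passed to `codeql database create`):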
--command="mvn clean install --file pom.xml -Dmaven.test.skip=true"
Never fail the build, whatever the project result (`-fn` is Maven's fail-never option):
--command="mvn -fn clean install --file pom.xml -Dmaven.test.skip=true"

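For compiled languages, the build has to run while the database is being created; this mainly concerns Java. Non-compiled languages are scanned directly. Go can be analyzed either with or without a build.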

Basic query conditions

Filtering Method


import java

from Method method
where method.hasName("parseObject")
select method

Querying the class that declares the method (className)

import java

from Method method
where method.hasName("parseObject")
select method,method.getDeclaringType()

Querying by method name and interface name

For example, to find the toObject method on every subtype of ContentTypeHandler:

import java

from Method method
where method.hasName("toObject") and method.getDeclaringType().getASupertype().hasQualifiedName("org.apache.struts2.rest.handler", "ContentTypeHandler")
select method

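getDeclaringType() // gets the type that declares the method

getDeclaringType().getASupertype() // gets a direct supertype (superclass or implemented interface) of the declaring type

hasQualifiedName(package, name) // holds if the type has the given package name and type name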

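Call and Callable

Callable represents anything that can be called: a method or a constructor.

Call represents an invocation of a Callable (a method call, a constructor call, and so on).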
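
The relationship between the two can be sketched with a small query. The choice of `java.io.File` as the target type is only an example; `Call.getCallee()` links each call to the Callable it invokes.

```ql
import java

// Every call (method call or constructor call) whose target is declared on java.io.File.
// This matches both `new File(path)` and calls such as `file.exists()`.
from Call call, Callable callee
where
  call.getCallee() = callee and
  callee.getDeclaringType().hasQualifiedName("java.io", "File")
select call, callee
```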

Expr

Expr represents an expression; in these notes it mostly stands for a call argument.

MethodAccess

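Used to filter method calls.

The usual pattern is to find the Method first and then compare it with MethodAccess.getMethod().

For example, to find calls to the toObject() method of ContentTypeHandler: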

import java

from MethodAccess call, Method method
where method.hasName("toObject") and method.getDeclaringType().getASupertype().hasQualifiedName("org.apache.struts2.rest.handler", "ContentTypeHandler") and call.getMethod() = method
select call

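This only finds types such as JsonLibHandler that declare the interface directly, because getASupertype() returns direct supertypes only.

To improve on this, use `getAnAncestor()` or `getASupertype*()`: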

import java

from MethodAccess call, Method method
where method.hasName("toObject") and method.getDeclaringType().getAnAncestor().hasQualifiedName("org.apache.struts2.rest.handler", "ContentTypeHandler") and call.getMethod() = method
select call

Filtering constructors

Suppose that new File is our sink in the code. We can write the following QL:

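/** A `new` expression that constructs `java.io.File` or one of its subtypes. */
class FileConstruct extends ClassInstanceExpr {
    FileConstruct() {
        // getASupertype*() is reflexive, so File itself is included
        this.getConstructor().getDeclaringType().getASupertype*().hasQualifiedName("java.io", "File")
    }
}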
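
As a rough illustration of how such a sink is used, the following standalone query lists every `new File(...)` expression together with its first argument, which is usually the path worth tracking. It is a sketch that matches `java.io.File` exactly and does not depend on the class defined above.

```ql
import java

// Each construction of java.io.File and the first argument passed to the constructor.
from ClassInstanceExpr newFile
where newFile.getConstructedType().hasQualifiedName("java.io", "File")
select newFile, newFile.getArgument(0)
```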

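Data flow tracking

Analyzing SpEL with local data flow

Local data flow is data flow within a single method or callable; the flow is considered broken as soon as a variable leaves that method. Local data flow is usually easier, faster, and more precise than global data flow.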

import java
import semmle.code.java.frameworks.spring.SpringController
import semmle.code.java.dataflow.TaintTracking
from Call call,Callable parseExpression,SpringRequestMappingMethod route
where
    call.getCallee() = parseExpression and 
    parseExpression.getDeclaringType().hasQualifiedName("org.springframework.expression", "ExpressionParser") and
    parseExpression.hasName("parseExpression") and 
   TaintTracking::localTaint(DataFlow::parameterNode(route.getARequestParameter()),DataFlow::exprNode(call.getArgument(0))) 
select route.getARequestParameter(),call

Global data flow analysis requires extending the DataFlow::Configuration class and overriding the isSource and isSink predicates.

class MyConfig extends DataFlow::Configuration {
  MyConfig() { this = "Myconfig" }
  override predicate isSource(DataFlow::Node source) {
    ....
    
  }

    override predicate isSink(DataFlow::Node sink) {
    ....
    
  }
}

from MyConfig config, DataFlow::PathNode source, DataFlow::PathNode sink
where config.hasFlowPath(source, sink)
select sink.getNode(), source, sink, "source are"

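Why data flow breaks

- External methods that are not compiled into the database. This is the most common cause, and nearly every scanner on the market has the same problem. The details are complicated, but roughly: the cost of building data flow grows with the complexity of the AST being scanned and the database becomes too large, so tools trade time against usability and only analyze the code that is compiled directly. As a result, dependencies pulled in through Maven are not scanned.
- Complex string concatenation, such as append, and some other string assignments. Out of the box these are generally not modeled, so you have to add them yourself. Some tools, such as Fortify, ship with connections for part of these cases, but sometimes you still have to track them down yourself.
- Forced type casts.

isAdditionalFlowStep tips

To decide where an additional step is needed, locate the break with a simple bisection: first move the sink earlier in the path; if a result is then detected, move the sink later again, and repeat until you find the point where the flow breaks. A lesser-known fact: data flows can be combined, for example a sink can itself be defined by a hasFlow expression.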
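
Once the break has been located, the missing edge can be added by overriding the additional-step predicate, which on `DataFlow::Configuration` is named `isAdditionalFlowStep`. The sketch below assumes the same class-based data flow API used elsewhere in these notes (newer CodeQL releases may differ). The method name `wrap` is hypothetical and stands for whatever external method the flow stops at; the source and sink are simplified placeholders.

```ql
import java
import semmle.code.java.dataflow.DataFlow

class BridgedConfig extends DataFlow::Configuration {
  BridgedConfig() { this = "BridgedConfig" }

  // Source (illustrative): the first parameter of any method named `toObject`.
  override predicate isSource(DataFlow::Node source) {
    exists(Method m | m.hasName("toObject") | source.asParameter() = m.getParameter(0))
  }

  // Sink (illustrative): the first argument of any call to a method named `fromXML`.
  override predicate isSink(DataFlow::Node sink) {
    exists(MethodAccess ma | ma.getMethod().hasName("fromXML") | sink.asExpr() = ma.getArgument(0))
  }

  // Extra edge: treat the result of the hypothetical external method `wrap`
  // as carrying the data of its first argument.
  override predicate isAdditionalFlowStep(DataFlow::Node pred, DataFlow::Node succ) {
    exists(MethodAccess ma |
      ma.getMethod().hasName("wrap") and
      pred.asExpr() = ma.getArgument(0) and
      succ.asExpr() = ma
    )
  }
}

from BridgedConfig config, DataFlow::Node source, DataFlow::Node sink
where config.hasFlow(source, sink)
select sink, "Flow reaches the sink through the added step."
```

Keeping the added step as narrow as possible (one method, one argument) avoids introducing false paths elsewhere in the analysis.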

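Example (Struts 2)

The problem arises because part of the Apache Struts framework is the ability to accept requests in many different formats, or content types. It provides a pluggable system for supporting these content types through the ContentTypeHandler interface, which declares the following method: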

/**
 * Populates an object using data from the input stream
 * @param in The input stream, usually the body of the request
 * @param target The target, usually the action class
 * @throws IOException If unable to write to the output stream
 */
void toObject(Reader in, Object target) throws IOException;

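A new content type handler is defined by implementing the interface and defining a toObject method, which takes data in the specified content type (in the form of a Reader) and uses it to populate the Java object target, usually through a deserialization routine. However, the in parameter is typically populated from the body of the request without any sanitization or safety checks. It should therefore be treated as "untrusted" user data and only be deserialized under certain safe conditions.

Finding XML deserialization

XStream is a Java framework for serializing Java objects to XML, and it is used by Apache Struts. It provides a method, XStream.fromXML, for deserializing XML into a Java object. By default the input is not validated in any way and is vulnerable to remote code execution. In this section we identify calls to fromXML in the codebase.

The steps are as follows:

Find all method calls in the program.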

import java

from MethodAccess call
select call

Update the query to report the method that each method call invokes.

import java

from MethodAccess call, Method method
where call.getMethod() = method
select call, method

Find all calls in the program to methods named fromXML.

import java

from MethodAccess fromXML, Method method
where
  fromXML.getMethod() = method and
  method.getName() = "fromXML"
select fromXML

The XStream.fromXML method deserializes its first argument (that is, the argument at index 0). Update the query to report the deserialized argument.

import java

from MethodAccess fromXML,Expr expr
where fromXML.getMethod().getName()="fromXML" and
      expr=fromXML.getArgument(0)
select fromXML,expr

Alternatively, we can encapsulate this logic in a predicate:

predicate isXMLDeserialized(Expr arg) {
  exists(MethodAccess fromXML |
    fromXML.getMethod().getName() = "fromXML" and
    arg = fromXML.getArgument(0)
  )
}

Finding implementations of the toObject method from ContentTypeHandler

Create a CodeQL class called ContentTypeHandler that finds the interface org.apache.struts2.rest.handler.ContentTypeHandler.

import java

/** The interface `org.apache.struts2.rest.handler.ContentTypeHandler`. */
class ContentTypeHandler extends RefType {
  ContentTypeHandler() {
    this.hasQualifiedName("org.apache.struts2.rest.handler", "ContentTypeHandler")
  }
}

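Create a CodeQL class called ContentTypeHandlerToObject that identifies Methods named toObject declared on classes whose direct supertypes include ContentTypeHandler.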

class ContentTypeHandlerToObject extends Method {
  ContentTypeHandlerToObject() {
    this.getDeclaringType().getASupertype() instanceof ContentTypeHandler and
    this.hasName("toObject")
  }
}

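toObject methods should treat their first parameter as untrusted user input. Write a query that finds the first (that is, index 0) parameter of toObject methods.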
from ContentTypeHandlerToObject toObjectMethod
select toObjectMethod.getParameter(0)

Data flow tracking

The official workshop suggests the following template:

/**
 * @name Unsafe XML deserialization
 * @kind problem
 * @id java/unsafe-deserialization
 */
import java
import semmle.code.java.dataflow.DataFlow

// TODO add previous class and predicate definitions here

class StrutsUnsafeDeserializationConfig extends DataFlow::Configuration {
  StrutsUnsafeDeserializationConfig() { this = "StrutsUnsafeDeserializationConfig" }
  override predicate isSource(DataFlow::Node source) {
    exists(/** TODO fill me in **/ |
      source.asParameter() = /** TODO fill me in **/
    )
  }
  override predicate isSink(DataFlow::Node sink) {
    exists(/** TODO fill me in **/ |
      /** TODO fill me in **/
      sink.asExpr() = /** TODO fill me in **/
    )
  }
}

from StrutsUnsafeDeserializationConfig config, DataFlow::Node source, DataFlow::Node sink
where config.hasFlow(source, sink)
select sink, "Unsafe XML deserialization"

Defining the source

override predicate isSource(DataFlow::Node source) {
  exists(ContentTypeHandlerToObject toObjectMethod |
    source.asParameter() = toObjectMethod.getParameter(0)
  )
}

Defining the sink
override predicate isSink(DataFlow::Node sink) {
  exists(Expr arg |
    isXMLDeserialized(arg) and
    sink.asExpr() = arg
  )
}
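
For this result it is easy to verify that it is correct, because the source and the sink are in the same method. For many data flow problems, however, that is not the case.

We can update the query so that it reports not only the sink but also the source and the path from that source to the sink. The way to do this is to convert the query into a path-problem query. Five parts need to change:
- Change the @kind from problem to path-problem. This tells the CodeQL toolchain to interpret the results of this query as path results.
- Add a new import, DataFlow::PathGraph, which reports the path data alongside the query results.
- Change the source and sink variables from `DataFlow::Node` to `DataFlow::PathNode`, so that the nodes retain path information.
- Use hasFlowPath instead of hasFlow.
- Change the select clause to report source and sink as the second and third columns. The toolchain combines this data with the path information from PathGraph to build the paths.

Adding the path query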

/**
* @name Unsafe XML deserialization
* @kind path-problem
* @id java/unsafe-deserialization
*/
import java
import semmle.code.java.dataflow.DataFlow
import DataFlow::PathGraph

predicate isXMLDeserialized(Expr arg) {
  exists(MethodAccess fromXML |
    fromXML.getMethod().getName() = "fromXML" and
    arg = fromXML.getArgument(0)
  )
}

/** The interface `org.apache.struts2.rest.handler.ContentTypeHandler`. */
class ContentTypeHandler extends RefType {
  ContentTypeHandler() {
    this.hasQualifiedName("org.apache.struts2.rest.handler", "ContentTypeHandler")
  }
}

/** A `toObject` method on a subtype of `org.apache.struts2.rest.handler.ContentTypeHandler`. */
class ContentTypeHandlerToObject extends Method {
  ContentTypeHandlerToObject() {
    this.getDeclaringType().getASupertype() instanceof ContentTypeHandler and
    this.hasName("toObject")
  }
}

class StrutsUnsafeDeserializationConfig extends DataFlow::Configuration {
  StrutsUnsafeDeserializationConfig() { this = "StrutsUnsafeDeserializationConfig" }
  override predicate isSource(DataFlow::Node source) {
    exists(ContentTypeHandlerToObject toObjectMethod |
      source.asParameter() = toObjectMethod.getParameter(0)
    )
  }
  override predicate isSink(DataFlow::Node sink) {
    exists(Expr arg |
      isXMLDeserialized(arg) and
      sink.asExpr() = arg
    )
  }
}

from StrutsUnsafeDeserializationConfig config, DataFlow::PathNode source, DataFlow::PathNode sink
where config.hasFlowPath(source, sink)
select sink, source, sink, "Unsafe XML deserialization"

References

https://github.com/githubsatelliteworkshops/codeql/blob/master/java.md
