Closed
Description
Removing an HTML document's internal_subset leaks memory.
What's the output from nokogiri -v?
# Nokogiri (1.7.0.1)
---
warnings: []
nokogiri: 1.7.0.1
ruby:
  version: 2.4.4
  platform: x86_64-darwin17
  description: ruby 2.4.4p296 (2018-03-28 revision 63013) [x86_64-darwin17]
  engine: ruby
libxml:
  binding: extension
  source: packaged
  libxml2_path: "/Users/steve/programming/nokogumbo/vendor/bundle/ruby/2.4.0/gems/nokogiri-1.7.0.1/ports/x86_64-apple-darwin17.4.0/libxml2/2.9.4"
  libxslt_path: "/Users/steve/programming/nokogumbo/vendor/bundle/ruby/2.4.0/gems/nokogiri-1.7.0.1/ports/x86_64-apple-darwin17.4.0/libxslt/1.1.29"
  libxml2_patches: []
  libxslt_patches: []
  compiled: 2.9.4
  loaded: 2.9.4
Can you provide a self-contained script that reproduces what you're seeing?
#!/usr/bin/env ruby
# encoding: utf-8
require 'nokogiri'
1_000_000.times do |i|
  doc = Nokogiri::HTML::Document.new
  doc.internal_subset.remove
end
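Here are the two leaks. Both blocks are allocated by xmlStrdup, called from xmlCreateIntSubset while the new document is being created.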
Leak: 0x7fddedc00250 size=16 zone: DefaultMallocZone_0x10cdff000
Call stack: [thread 0x7fff98b0b380]: | 0x7fff60324015 (libdyld.dylib) start | 0x10cab9f3b (ruby) main | 0x10cb0d71d (libruby.2.4.dylib) ruby_run_node | 0x10cb0d7ec (libruby.2.4.dylib) ruby_exec_internal | 0x10cc02df4 (libruby.2.4.dylib) vm_exec | 0x10cbf7158 (libruby.2.4.dylib) vm_exec_core | 0x10cc06729 (libruby.2.4.dylib) vm_call_cfunc | 0x10cb52dbb (libruby.2.4.dylib) int_dotimes | 0x10cbff937 (libruby.2.4.dylib) rb_yield_1 | 0x10cc0c0a2 (libruby.2.4.dylib) invoke_block_from_c_splattable | 0x10cc02df4 (libruby.2.4.dylib) vm_exec | 0x10cbf7781 (libruby.2.4.dylib) vm_exec_core | 0x10cc06729 (libruby.2.4.dylib) vm_call_cfunc | 0x10cf24e2a (nokogiri.bundle) new | 0x10cfb4e99 (nokogiri.bundle) htmlNewDoc | 0x10cfb4c5d (nokogiri.bundle) htmlNewDocNoDtD | 0x10cf818c7 (nokogiri.bundle) xmlCreateIntSubset | 0x10d016e6a (nokogiri.bundle) xmlStrdup | 0x10d016d8d (nokogiri.bundle) xmlStrndup | 0x10cb22de2 (libruby.2.4.dylib) objspace_xmalloc0 | 0x7fff604cc4c7 (libsystem_malloc.dylib) malloc | 0x7fff604cd1e1 (libsystem_malloc.dylib) malloc_zone_malloc
Leak: 0x7fddedc00450 size=48 zone: DefaultMallocZone_0x10cdff000
Call stack: [thread 0x7fff98b0b380]: | 0x7fff60324015 (libdyld.dylib) start | 0x10cab9f3b (ruby) main | 0x10cb0d71d (libruby.2.4.dylib) ruby_run_node | 0x10cb0d7ec (libruby.2.4.dylib) ruby_exec_internal | 0x10cc02df4 (libruby.2.4.dylib) vm_exec | 0x10cbf7158 (libruby.2.4.dylib) vm_exec_core | 0x10cc06729 (libruby.2.4.dylib) vm_call_cfunc | 0x10cb52dbb (libruby.2.4.dylib) int_dotimes | 0x10cbff937 (libruby.2.4.dylib) rb_yield_1 | 0x10cc0c0a2 (libruby.2.4.dylib) invoke_block_from_c_splattable | 0x10cc02df4 (libruby.2.4.dylib) vm_exec | 0x10cbf7781 (libruby.2.4.dylib) vm_exec_core | 0x10cc06729 (libruby.2.4.dylib) vm_call_cfunc | 0x10cf24e2a (nokogiri.bundle) new | 0x10cfb4e99 (nokogiri.bundle) htmlNewDoc | 0x10cfb4c5d (nokogiri.bundle) htmlNewDocNoDtD | 0x10cf819aa (nokogiri.bundle) xmlCreateIntSubset | 0x10d016e6a (nokogiri.bundle) xmlStrdup | 0x10d016d8d (nokogiri.bundle) xmlStrndup | 0x10cb22de2 (libruby.2.4.dylib) objspace_xmalloc0 | 0x7fff604cc4c7 (libsystem_malloc.dylib) malloc | 0x7fff604cd1e1 (libsystem_malloc.dylib) malloc_zone_malloc
This is causing problems for users of Nokogumbo (see rubys/nokogumbo#20).
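For anyone without a leak checker at hand, a rough way to watch the growth is to compare the process's resident set size before and after the loop. This is only a sketch: `rss_kib` and `rss_growth_kib` are hypothetical helper names, not part of Nokogiri, and the code assumes either a Linux-style `/proc/self/status` or a `ps` that accepts `-o rss=`. RSS is a coarse signal, so a small delta proves nothing either way.

```ruby
# Hypothetical helper (not a Nokogiri API): resident set size of the current
# process in KiB. Reads /proc/self/status when it exists (Linux), otherwise
# falls back to `ps` (macOS and other Unix-likes).
def rss_kib
  if File.readable?('/proc/self/status')
    File.read('/proc/self/status')[/^VmRSS:\s+(\d+)/, 1].to_i
  else
    `ps -o rss= -p #{Process.pid}`.to_i
  end
end

# Hypothetical helper: runs the given block `iterations` times and returns
# how much the RSS changed, in KiB. GC runs before each measurement so that
# collectable Ruby objects are less likely to be counted.
def rss_growth_kib(iterations)
  GC.start
  before = rss_kib
  iterations.times { yield }
  GC.start
  rss_kib - before
end

# Usage with the reproduction script above (requires the nokogiri gem):
#
#   require 'nokogiri'
#   growth = rss_growth_kib(1_000_000) do
#     Nokogiri::HTML::Document.new.internal_subset.remove
#   end
#   puts "RSS grew by #{growth} KiB"
```

Running the same measurement with the `internal_subset.remove` call left out gives a baseline to compare against.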